Chimeric retinoid X receptors and their use in a novel ecdysone receptor-based inducible gene expression system

ABSTRACT

This invention relates to the field of biotechnology or genetic engineering. Specifically, this invention relates to the field of gene expression. More specifically, this invention relates to a novel ecdysone receptor/chimeric retinoid X receptor-based inducible gene expression system and methods of modulating gene expression in a host cell for applications such as gene therapy, large-scale production of proteins and antibodies, cell-based high throughput screening assays, functional genomics and regulation of traits in transgenic organisms.

FIELD OF THE INVENTION

[0001] This invention relates to the field of biotechnology or genetic engineering. Specifically, this invention relates to the field of gene expression. More specifically, this invention relates to a novel ecdysone receptor/chimeric retinoid X receptor-based inducible gene expression system and methods of modulating the expression of a gene within a host cell using this inducible gene expression system.

BACKGROUND OF THE INVENTION

[0002] Various publications are cited herein, the disclosures of which are incorporated by reference in their entireties. However, the citation of any reference herein should not be construed as an admission that such reference is available as “Prior Art” to the instant application.

[0003] In the field of genetic engineering, precise control of gene expression is a valuable tool for studying, manipulating, and controlling development and other physiological processes. Gene expression is a complex biological process involving a number of specific protein-protein interactions. In order for gene expression to be triggered, such that it produces the RNA necessary as the first step in protein synthesis, a transcriptional activator must be brought into proximity of a promoter that controls gene transcription. Typically, the transcriptional activator itself is associated with a protein that has at least one DNA binding domain that binds to DNA binding sites present in the promoter regions of genes. Thus, for gene expression to occur, a protein comprising a DNA binding domain and a transactivation domain located at an appropriate distance from the DNA binding domain must be brought into the correct position in the promoter region of the gene.

[0004] The traditional transgenic approach utilizes a cell-type specific promoter to drive the expression of a designed transgene. A DNA construct containing the transgene is first incorporated into a host genome. When triggered by a transcriptional activator, expression of the transgene occurs in a given cell type.

[0005] Another means to regulate expression of foreign genes in cells is through inducible promoters. Examples of such inducible promoters include the PR1-a promoter, prokaryotic repressor-operator systems, immunosuppressive-immunophilin systems, and higher eukaryotic transcription activation systems such as steroid hormone receptor systems; each is described below.

[0006] The PR1-a promoter from tobacco is induced during the systemic acquired resistance response following pathogen attack. The use of PR1-a may be limited because it often responds to endogenous materials and external factors such as pathogens, UV-B radiation, and pollutants. Gene regulation systems based on promoters induced by heat shock, interferon and heavy metals have been described (Wurn et al., 1986, Proc. Natl. Acad. Sci. USA 83: 5414-5418; Arnheiter et al., 1990, Cell 62: 51-61; Filmus et al., 1992, Nucleic Acids Research 20: 27550-27560). However, these systems have limitations due to their effect on expression of non-target genes. These systems are also leaky.

[0007] Prokaryotic repressor-operator systems utilize bacterial repressor proteins and the unique operator DNA sequences to which they bind. Both the tetracycline (“Tet”) and lactose (“Lac”) repressor-operator systems from the bacterium Escherichia coli have been used in plants and animals to control gene expression. In the Tet system, tetracycline binds to the TetR repressor protein, resulting in a conformational change that releases the repressor protein from the operator and thereby allows transcription to occur. In the Lac system, a lac operon is activated in response to the presence of lactose, or synthetic analogs such as isopropyl-β-D-thiogalactoside. Unfortunately, the use of such systems is restricted by unstable chemistry of the ligands, i.e. tetracycline and lactose, their toxicity, their natural presence, or the relatively high levels required for induction or repression. For similar reasons, utility of such systems in animals is limited.

[0008] Immunosuppressive molecules such as FK506, rapamycin and cyclosporine A can bind to immunophilins FKBP12, cyclophilin, etc. Using this information, a general strategy has been devised to bring together any two proteins simply by placing FK506 on each of the two proteins or by placing FK506 on one and cyclosporine A on another one. A synthetic homodimer of FK506 (FK1012) or a compound resulting from fusion of FK506-cyclosporine (FKCsA) can then be used to induce dimerization of these molecules (Spencer et al., 1993, Science 262:1019-24; Belshaw et al., 1996, Proc Natl Acad Sci USA 93:4604-7). Gal4 DNA binding domain fused to FKBP12 and VP16 activator domain fused to cyclophilin, and FKCsA compound were used to show heterodimerization and activation of a reporter gene under the control of a promoter containing Gal4 binding sites. Unfortunately, this system includes immunosuppressants that can have unwanted side effects, which limits its use for various mammalian gene switch applications.

[0009] Higher eukaryotic transcription activation systems such as steroid hormone receptor systems have also been employed. Steroid hormone receptors are members of the nuclear receptor superfamily and are found in vertebrate and invertebrate cells. Unfortunately, use of steroidal compounds that activate the receptors for the regulation of gene expression, particularly in plants and mammals, is limited due to their involvement in many other natural biological pathways in such organisms. In order to overcome such difficulties, an alternative system has been developed using insect ecdysone receptors (EcR).

[0010] Growth, molting, and development in insects are regulated by the ecdysone steroid hormone (molting hormone) and the juvenile hormones (Dhadialla, et al., 1998, Annu. Rev. Entomol. 43: 545-569). The molecular target for ecdysone in insects consists of at least ecdysone receptor (EcR) and ultraspiracle protein (USP). EcR is a member of the nuclear steroid receptor super family that is characterized by signature DNA and ligand binding domains, and an activation domain (Koelle et al. 1991, Cell, 67:59-77). EcR receptors are responsive to a number of steroidal compounds such as ponasterone A and muristerone A. Recently, non-steroidal compounds with ecdysteroid agonist activity have been described, including the commercially available insecticides tebufenozide and methoxyfenozide that are marketed world wide by Rohm and Haas Company (see International Patent Application No. PCT/EP96/00686 and U.S. Pat. No. 5,530,028). Both analogs have exceptional safety profiles to other organisms.

[0011] International Patent Applications No. PCT/US97/05330 (WO 97/38117) and PCT/US99/08381 (WO 99/58155) disclose methods for modulating the expression of an exogenous gene in which a DNA construct comprising the exogenous gene and an ecdysone response element is activated by a second DNA construct comprising an ecdysone receptor that, in the presence of a ligand therefor, and optionally in the presence of a receptor capable of acting as a silent partner, binds to the ecdysone response element to induce gene expression. The ecdysone receptor of choice was isolated from Drosophila melanogaster. Typically, such systems require the presence of the silent partner, preferably retinoid X receptor (RXR), in order to provide optimum activation. In mammalian cells, insect ecdysone receptor (EcR) heterodimerizes with retinoid X receptor (RXR) and regulates expression of target genes in a ligand dependent manner. International Patent Application No. PCT/US98/14215 (WO 99/02683) discloses that the ecdysone receptor isolated from the silk moth Bombyx mori is functional in mammalian systems without the need for an exogenous dimer partner.

[0012] U.S. Pat. No. 5,880,333 discloses a Drosophila melanogaster EcR and ultraspiracle (USP) heterodimer system used in plants in which the transactivation domain and the DNA binding domain are positioned on two different hybrid proteins. Unfortunately, this system is not effective for inducing reporter gene expression in animal cells (for comparison, see Example 1.2, below). In each of these cases, the transactivation domain and the DNA binding domain (either as native EcR as in International Patent Application No. PCT/US98/14215 or as modified EcR as in International Patent Application No. PCT/US97/05330) were incorporated into a single molecule and the other heterodimeric partners, either USP or RXR, were used in their native state.

[0013] Drawbacks of the above described EcR-based gene regulation systems include a considerable background activity in the absence of ligands and non-applicability of these systems for use in both plants and animals (see U.S. Pat. No. 5,880,333). For most applications that rely on modulating gene expression, these EcR-based systems are undesirable. Therefore, a need exists in the art for improved systems to precisely modulate the expression of exogenous genes in both plants and animals. Such improved systems would be useful for applications such as gene therapy, large-scale production of proteins and antibodies, cell-based high throughput screening assays, functional genomics and regulation of traits in transgenic animals. Improved systems that are simple, compact, and dependent on ligands that are relatively inexpensive, readily available, and of low toxicity to the host would prove useful for regulating biological systems.

[0014] Recently, Applicants have shown that an ecdysone receptor-based inducible gene expression system in which the transactivation and DNA binding domains are separated from each other by placing them on two different proteins results in greatly reduced background activity in the absence of a ligand and significantly increased activity over background in the presence of a ligand (pending application PCT/US01/09050, incorporated herein in its entirety by reference). This two-hybrid system is a significantly improved inducible gene expression modulation system compared to the two systems disclosed in applications PCT/US97/05330 and PCT/US98/14215.

[0015] Applicants previously demonstrated that an ecdysone receptor-based gene expression system in partnership with a dipteran (Drosophila melanogaster) or a lepidopteran (Choristoneura fumiferana) ultraspiracle protein (USP) is constitutively expressed in mammalian cells, while an ecdysone receptor-based gene expression system in partnership with a vertebrate retinoid X receptor (RXR) is inducible in mammalian cells (pending application PCT/US01/09050). Applicants have recently made the surprising discovery that a non-Dipteran and non-Lepidopteran invertebrate RXR can function similarly to vertebrate RXR in an ecdysone receptor-based inducible gene expression system (US application filed concurrently herewith).

[0016] Applicants have now shown that a chimeric RXR ligand binding domain, comprising at least two polypeptide fragments, wherein the first polypeptide fragment is from one species of vertebrate/invertebrate RXR and the second polypeptide fragment is from a different species of vertebrate/invertebrate RXR, whereby a vertebrate/invertebrate chimeric RXR ligand binding domain, a vertebrate/vertebrate chimeric RXR ligand binding domain, or an invertebrate/invertebrate chimeric RXR ligand binding domain is produced, can function similarly to or better than either the parental vertebrate RXR or the parental invertebrate RXR in an ecdysone receptor-based inducible gene expression system. As described herein, Applicants' novel ecdysone receptor/chimeric retinoid X receptor-based inducible gene expression system provides an inducible gene expression system in bacteria, fungi, yeast, animal, and mammalian cells that is characterized by increased ligand sensitivity and magnitude of transactivation.

SUMMARY OF THE INVENTION

[0017] The present invention relates to a novel ecdysone receptor/chimeric retinoid X receptor-based inducible gene expression system, novel chimeric receptor polynucleotides and polypeptides for use in the novel inducible gene expression system, and methods of modulating the expression of a gene within a host cell using this inducible gene expression system. In particular, Applicants' invention relates to a novel gene expression modulation system comprising a polynucleotide encoding a chimeric RXR ligand binding domain (LBD).

[0018] Specifically, the present invention relates to a gene expression modulation system comprising: a) a first gene expression cassette that is capable of being expressed in a host cell comprising a polynucleotide that encodes a first hybrid polypeptide comprising: i) a DNA-binding domain that recognizes a response element associated with a gene whose expression is to be modulated; and ii) an ecdysone receptor ligand binding domain; and b) a second gene expression cassette that is capable of being expressed in the host cell comprising a polynucleotide sequence that encodes a second hybrid polypeptide comprising: i) a transactivation domain; and ii) a chimeric retinoid X receptor ligand binding domain.

[0019] The present invention also relates to a gene expression modulation system comprising: a) a first gene expression cassette that is capable of being expressed in a host cell comprising a polynucleotide that encodes a first hybrid polypeptide comprising: i) a DNA-binding domain that recognizes a response element associated with a gene whose expression is to be modulated; and ii) a chimeric retinoid X receptor ligand binding domain; and b) a second gene expression cassette that is capable of being expressed in the host cell comprising a polynucleotide sequence that encodes a second hybrid polypeptide comprising: i) a transactivation domain; and ii) an ecdysone receptor ligand binding domain.

[0020] The present invention also relates to a gene expression modulation system according to the invention further comprising c) a third gene expression cassette comprising: i) a response element to which the DNA-binding domain of the first hybrid polypeptide binds; ii) a promoter that is activated by the transactivation domain of the second hybrid polypeptide; and iii) a gene whose expression is to be modulated.
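The three-cassette arrangement described in paragraphs [0018] to [0020] can be sketched as a small data model. This is a minimal illustration under stated assumptions, not a representation of any actual construct: the class names, field names, and the `is_valid_switch` check are hypothetical, and the GAL4 and VP16 labels are borrowed from the figure descriptions purely as example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridPolypeptide:
    """Hypothetical model of one hybrid polypeptide of the switch."""
    functional_domain: str      # "DBD" (DNA-binding) or "AD" (transactivation)
    functional_name: str        # example label, e.g. "GAL4" or "VP16"
    ligand_binding_domain: str  # "EcR" or "chimeric RXR"


@dataclass(frozen=True)
class ThirdCassette:
    """Hypothetical model of the cassette carrying the gene to be modulated."""
    response_element_for: str   # name of the DBD that binds the response element
    gene: str


def is_valid_switch(first, second, third):
    """Check the pairing rules stated in paragraphs [0018] to [0020].

    The first hybrid carries the DNA-binding domain, the second carries the
    transactivation domain, one of them carries an EcR LBD and the other a
    chimeric RXR LBD, and the response element matches the first hybrid's DBD.
    """
    lbds = {first.ligand_binding_domain, second.ligand_binding_domain}
    return (
        first.functional_domain == "DBD"
        and second.functional_domain == "AD"
        and lbds == {"EcR", "chimeric RXR"}
        and third.response_element_for == first.functional_name
    )


# Arrangement of [0018]: DBD with EcR LBD, AD with chimeric RXR LBD.
dbd_ecr = HybridPolypeptide("DBD", "GAL4", "EcR")
ad_rxr = HybridPolypeptide("AD", "VP16", "chimeric RXR")
# Arrangement of [0019]: the two ligand binding domains are swapped.
dbd_rxr = HybridPolypeptide("DBD", "GAL4", "chimeric RXR")
ad_ecr = HybridPolypeptide("AD", "VP16", "EcR")
reporter = ThirdCassette(response_element_for="GAL4", gene="reporter")
```

Both arrangements pass the check because the sketch only requires that the two ligand binding domains differ in kind, which mirrors the way [0018] and [0019] mirror each other.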

[0021] The present invention also relates to a gene expression cassette that is capable of being expressed in a host cell, wherein the gene expression cassette comprises a polynucleotide that encodes a hybrid polypeptide comprising either i) a DNA-binding domain that recognizes a response element associated with a gene whose expression is to be modulated, or ii) a transactivation domain; and a chimeric retinoid X receptor ligand binding domain.

[0022] The present invention also relates to an isolated polynucleotide that encodes a hybrid polypeptide comprising either i) a DNA-binding domain that recognizes a response element associated with a gene whose expression is to be modulated, or ii) a transactivation domain; and a chimeric vertebrate and invertebrate retinoid X receptor ligand binding domain. The present invention also relates to an isolated hybrid polypeptide encoded by the isolated polynucleotide according to the invention.

[0023] The present invention also relates to an isolated polynucleotide encoding a truncated chimeric RXR LBD. In a specific embodiment, the isolated polynucleotide encodes a truncated chimeric RXR LBD, wherein the truncation mutation affects ligand binding activity or ligand sensitivity of the chimeric RXR LBD. In another specific embodiment, the isolated polynucleotide encodes a truncated chimeric RXR polypeptide comprising a truncation mutation that increases ligand sensitivity of a heterodimer comprising the truncated chimeric RXR polypeptide and a dimerization partner. In a specific embodiment, the dimerization partner is an ecdysone receptor polypeptide.

[0024] The present invention also relates to an isolated polypeptide encoded by a polynucleotide according to Applicants' invention.

[0025] The present invention also relates to an isolated hybrid polypeptide comprising either i) a DNA-binding domain that recognizes a response element associated with a gene whose expression is to be modulated, or ii) a transactivation domain; and a chimeric retinoid X receptor ligand binding domain.

[0026] The present invention relates to an isolated truncated chimeric RXR LBD comprising a truncation mutation, wherein the truncated chimeric RXR LBD is encoded by a polynucleotide according to the invention.

[0027] Thus, the present invention also relates to an isolated truncated chimeric RXR LBD comprising a truncation mutation that affects ligand binding activity or ligand sensitivity of said truncated chimeric RXR LBD.

[0028] The present invention also relates to an isolated truncated chimeric RXR LBD comprising a truncation mutation that increases ligand sensitivity of a heterodimer comprising the truncated chimeric RXR LBD and a dimerization partner. In a specific embodiment, the dimerization partner is an ecdysone receptor polypeptide.

[0029] Applicants' invention also relates to methods of modulating gene expression in a host cell using a gene expression modulation system according to the invention. Specifically, Applicants' invention provides a method of modulating the expression of a gene in a host cell comprising the steps of: a) introducing into the host cell a gene expression modulation system according to the invention; b) introducing into the host cell a gene expression cassette comprising i) a response element comprising a domain recognized by the DNA binding domain from the first hybrid polypeptide; ii) a promoter that is activated by the transactivation domain of the second hybrid polypeptide; and iii) a gene whose expression is to be modulated; and c) introducing into the host cell a ligand; whereby upon introduction of the ligand into the host, expression of the gene of b)iii) is modulated.

[0030] Applicants' invention also provides a method of modulating the expression of a gene in a host cell comprising a gene expression cassette comprising a response element comprising a domain recognized by the DNA binding domain from the first hybrid polypeptide; a promoter that is activated by the transactivation domain of the second hybrid polypeptide; and a gene whose expression is to be modulated; wherein the method comprises the steps of: a) introducing into the host cell a gene expression modulation system according to the invention; and b) introducing into the host cell a ligand; whereby upon introduction of the ligand into the host, expression of the gene is modulated.

[0031] Applicants' invention also provides an isolated host cell comprising an inducible gene expression system according to the invention. The present invention also relates to an isolated host cell comprising a gene expression cassette, a polynucleotide, or a polypeptide according to the invention. Accordingly, Applicants' invention also relates to a non-human organism comprising a host cell according to the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0032] FIG. 1: Expression data of VP16LmUSP-EF, VP16MmRXRα-EF and three independent clones of VP16MmRXRα(1-7)-LmUSP(8-12)-EF in NIH3T3 cells along with GAL4CfEcR-CDEF and pFRLuc in the presence of non-steroid (GSE) ligand.

[0033] FIG. 2: Expression data of VP16LmUSP-EF, VP16MmRXRα-EF and two independent clones of VP16MmRXRα(1-7)-LmUSP(8-12)-EF in NIH3T3 cells along with GAL4CfEcR-CDEF and pFRLuc in the presence of non-steroid (GSE) ligand.

[0034] FIG. 3: Expression data of VP16LmUSP-EF, VP16MmRXRα-EF and two independent clones of VP16MmRXRα(1-7)-LmUSP(8-12)-EF in A549 cells along with GAL4CfEcR-CDEF and pFRLuc in the presence of non-steroid (GSE) ligand.

[0035] FIG. 4: Amino acid sequence alignments of the EF domains of six vertebrate RXRs (A) and six invertebrate RXRs (B). B6, B8, B9, B10 and B11 denote β chimera junctions. A1 denotes the junction for the α chimera. Helices 1-12 are denoted as H1-H12 and β pleated sheets are denoted as S1 and S2. F denotes the F domain junction.

[0036] FIG. 5: Expression data of GAL4CfEcR-CDEF/VP16chimeric RXR-based gene switches 1.3-1.6 in NIH3T3 cells along with pFRLuc in the presence of non-steroid (GSE) ligand.

[0037] FIG. 6: Expression data of gene switches comprising the DEF domains of EcRs from CfEcR, DmEcR, TmEcR, or AmaEcR fused to GAL4 DNA binding domain and the EF domains of RXR/USPs from CfUSP, DmUSP, LmUSP, MmRXRα, a chimera between MmRXRα and LmUSP (Chimera), AmaRXR1, or AmaRXR2 fused to a VP16 activation domain along with pFRLuc in NIH3T3 cells in the presence of steroid (PonA) or non-steroid (GSE) ligand. The different RXR/USP constructs were compared in partnership with GAL4CfEcR-DEF.

[0038] FIG. 7: Expression data of gene switches comprising the DEF domains of EcRs from CfEcR, DmEcR, TmEcR, or AmaEcR fused to GAL4 DNA binding domain and the EF domains of RXR/USPs from CfUSP, DmUSP, LmUSP, MmRXRα, a chimera between MmRXRα and LmUSP (Chimera), AmaRXR1, or AmaRXR2 fused to a VP16 activation domain along with pFRLuc in NIH3T3 cells in the presence of steroid (PonA) or non-steroid (GSE) ligand. The different RXR/USP constructs were compared in partnership with GAL4DmEcR-DEF.

[0039] FIG. 8: Expression data of gene switches comprising the DEF domains of EcRs from CfEcR, DmEcR, TmEcR, or AmaEcR fused to GAL4 DNA binding domain and the EF domains of RXR/USPs from CfUSP, DmUSP, LmUSP, MmRXRα, a chimera between MmRXRα and LmUSP (Chimera), AmaRXR1, or AmaRXR2 fused to a VP16 activation domain along with pFRLuc in NIH3T3 cells in the presence of steroid (PonA) or non-steroid (GSE) ligand. The different EcR constructs were compared in partnership with a chimeric RXR-EF (MmRXRα-(1-7)-LmUSP(8-12)-EF).

[0040] FIG. 9: Expression data. VP16/MmRXRα-EF (aRXR), VP16/Chimera between MmRXRα-EF and LmUSP-EF (MmRXRα-(1-7)-LmUSP(8-12)-EF; aCh7), VP16/LmUSP-EF (LmUSP) and three independent clones from each of five VP16/chimeras between HsRXRβ-EF and LmUSP-EF (see Table 1 for chimeric RXR constructs; bRXRCh6, bRXRCh8, bRXRCh9, bRXRCh10, and bRXRCh11) were transfected into NIH3T3 cells along with GAL4/CfEcR-DEF and pFRLuc. The transfected cells were grown in the presence of 0, 0.2, 1 and 10 μM non-steroidal ligand (GSE). The reporter activity was quantified 48 hours after adding ligands.

[0041] FIG. 10: Expression data. VP16/MmRXRα-EF (aRXR), VP16/Chimera between MmRXRα-EF and LmUSP-EF (MmRXRα-(1-7)-LmUSP(8-12)-EF; aCh7), VP16/LmUSP-EF (LmUSP) and three independent clones from each of five VP16/chimeras between HsRXRβ-EF and LmUSP-EF (see Table 1 for chimeric RXR constructs; bRXRCh6, bRXRCh8, bRXRCh9, bRXRCh10, and bRXRCh11) were transfected into NIH3T3 cells along with GAL4/CfEcR-DEF and pFRLuc. The transfected cells were grown in the presence of 0, 0.2, 1 and 10 μM steroid ligand (PonA) or 0, 0.04, 0.2, 1, and 10 μM non-steroidal ligand (GSE). The reporter activity was quantified 48 hours after adding ligands.

[0042] FIG. 11: Expression data. VP16/MmRXRα-EF (aRXR), VP16/Chimera between MmRXRα-EF and LmUSP-EF (MmRXRα-(1-7)-LmUSP(8-12)-EF; aCh7), VP16/LmUSP-EF (LmUSP) and three independent clones from each of five VP16/chimeras between HsRXRβ-EF and LmUSP-EF (see Table 1 for chimeric RXR constructs; bRXRCh6, bRXRCh8, bRXRCh9, bRXRCh10, and bRXRCh11) were transfected into NIH3T3 cells along with GAL4/DmEcR-DEF and pFRLuc. The transfected cells were grown in the presence of 0, 0.2, 1 and 10 μM steroid ligand (PonA) or 0, 0.04, 0.2, 1, and 10 μM non-steroidal ligand (GSE). The reporter activity was quantified 48 hours after adding ligands.

[0043] FIG. 12: Effect of 9-cis-retinoic acid on transactivation potential of the GAL4CfEcR-DEF/VP16HsRXRβ-(1-8)-LmUSP-(9-12)-EF (β chimera 9) gene switch along with pFRLuc in NIH3T3 cells in the presence of non-steroid (GSE) and 9-cis-retinoic acid (9Cis) for 48 hours.

DETAILED DESCRIPTION OF THE INVENTION

[0044] Applicants have now shown that chimeric RXR ligand binding domains are functional within an EcR-based inducible gene expression modulation system in mammalian cells and that these chimeric RXR LBDs exhibit advantageous ligand sensitivities and transactivation abilities. Thus, Applicants' invention provides a novel ecdysone receptor-based inducible gene expression system comprising a chimeric retinoid X receptor ligand binding domain that is useful for modulating expression of a gene of interest in a host cell. In a particularly desirable embodiment, Applicants' invention provides an inducible gene expression system that has a reduced level of background gene expression and responds to submicromolar concentrations of non-steroidal ligand. Thus, Applicants' novel inducible gene expression system and its use in methods of modulating gene expression in a host cell overcome the limitations of currently available inducible expression systems and provide the skilled artisan with an effective means to control gene expression.

[0045] The present invention is useful for applications such as gene therapy, large scale production of proteins and antibodies, cell-based high throughput screening assays, functional genomics, proteomics, and metabolomics analyses, and regulation of traits in transgenic organisms, where control of gene expression levels is desirable. An advantage of Applicants' invention is that it provides a means to regulate gene expression and to tailor expression levels to suit the user's requirements.

DEFINITIONS

[0046] In this disclosure, a number of terms and abbreviations are used. The following definitions are provided and should be helpful in understanding the scope and practice of the present invention.

[0047] In a specific embodiment, the term “about” or “approximately” means within 20%, preferably within 10%, more preferably within 5%, and even more preferably within 1% of a given value or range.
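The tiers in this definition can be read as relative tolerances around a stated value. The following is a minimal sketch under that assumption; the function names `is_about` and `tightest_tier` are hypothetical, and the treatment of a zero target is a choice made only for this illustration.

```python
# Tolerance tiers named in the definition of "about", loosest to tightest.
TIERS = (0.20, 0.10, 0.05, 0.01)


def is_about(value, target, tolerance=0.20):
    """Return True if value lies within the relative tolerance of target."""
    if target == 0:
        # Assumption for this sketch: only an exact zero is "about" zero.
        return value == 0
    return abs(value - target) <= tolerance * abs(target)


def tightest_tier(value, target):
    """Return the tightest tier that value satisfies, or None if it meets none."""
    satisfied = [tier for tier in TIERS if is_about(value, target, tier)]
    return min(satisfied) if satisfied else None
```

For example, a measured value of 108 against a stated value of 100 deviates by 8%, so it satisfies the 20% and 10% tiers but not the 5% tier.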

[0048] The term “substantially free” means that a composition comprising “A” (where “A” is a single protein, DNA molecule, vector, recombinant host cell, etc.) is substantially free of “B” (where “B” comprises one or more contaminating proteins, DNA molecules, vectors, etc.) when at least about 75% by weight of the proteins, DNA, vectors (depending on the category of species to which A and B belong) in the composition is “A”. Preferably, “A” comprises at least about 90% by weight of the A+B species in the composition, most preferably at least about 99% by weight. It is also preferred that a composition, which is substantially free of contamination, contain only a single molecular weight species having the activity or characteristic of the species of interest.
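The weight thresholds in this definition amount to a simple fraction of “A” over the combined A+B species. The sketch below assumes the thresholds are applied as hard cutoffs, ignoring the “about” qualifier in the text; the function names and the returned labels are hypothetical.

```python
def weight_fraction_a(mass_a, mass_b):
    """Fraction by weight of A among the A+B species in the composition."""
    total = mass_a + mass_b
    if mass_a < 0 or mass_b < 0 or total <= 0:
        raise ValueError("masses must be non-negative and not both zero")
    return mass_a / total


def classify_composition(mass_a, mass_b):
    """Map the weight fraction of A onto the tiers named in the definition."""
    fraction = weight_fraction_a(mass_a, mass_b)
    if fraction >= 0.99:
        return "substantially free (most preferred)"
    if fraction >= 0.90:
        return "substantially free (preferred)"
    if fraction >= 0.75:
        return "substantially free"
    return "not substantially free"
```

A composition holding 80 mg of A and 20 mg of B, for instance, is 80% A by weight and so meets the 75% threshold but not the 90% one.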

[0049] The term “isolated” for the purposes of the present invention designates a biological material (nucleic acid or protein) that has been removed from its original environment (the environment in which it is naturally present). For example, a polynucleotide present in the natural state in a plant or an animal is not isolated; however, the same polynucleotide separated from the adjacent nucleic acids in which it is naturally present is considered “isolated”. The term “purified” does not require the material to be present in a form exhibiting absolute purity, exclusive of the presence of other compounds. It is rather a relative definition.

[0050] A polynucleotide is in the “purified” state after purification of the starting material or of the natural material by at least one order of magnitude, preferably 2 or 3 and preferably 4 or 5 orders of magnitude.

[0051] A “nucleic acid” is a polymeric compound comprised of covalently linked subunits called nucleotides. Nucleic acid includes polyribonucleic acid (RNA) and polydeoxyribonucleic acid (DNA), both of which may be single-stranded or double-stranded. DNA includes but is not limited to cDNA, genomic DNA, plasmid DNA, synthetic DNA, and semi-synthetic DNA. DNA may be linear, circular, or supercoiled.

[0052] A “nucleic acid molecule” refers to the phosphate ester polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; “RNA molecules”) or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; “DNA molecules”), or any phosphoester analogs thereof, such as phosphorothioates and thioesters, in either single stranded form, or a double-stranded helix. Double stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible. The term nucleic acid molecule, and in particular DNA or RNA molecule, refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear or circular DNA molecules (e.g., restriction fragments), plasmids, and chromosomes. In discussing the structure of particular double-stranded DNA molecules, sequences may be described herein according to the normal convention of giving only the sequence in the 5′ to 3′ direction along the non-transcribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA). A “recombinant DNA molecule” is a DNA molecule that has undergone a molecular biological manipulation.

[0053] The term “fragment” will be understood to mean a nucleotide sequence of reduced length relative to the reference nucleic acid and comprising, over the common portion, a nucleotide sequence identical to the reference nucleic acid. Such a nucleic acid fragment according to the invention may be, where appropriate, included in a larger polynucleotide of which it is a constituent. Such fragments comprise, or alternatively consist of, oligonucleotides ranging in length from at least 6, 8, 9, 10, 12, 15, 18, 20, 21, 22, 23, 24, 25, 30, 39, 40, 42, 45, 48, 50, 51, 54, 57, 60, 63, 66, 70, 75, 78, 80, 90, 100, 105, 120, 135, 150, 200, 300, 500, 720, 900, 1000 or 1500 consecutive nucleotides of a nucleic acid according to the invention.
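The definition of “fragment” can be illustrated as a contiguous-substring test on sequence strings. This is a minimal sketch, assuming sequences are plain strings of nucleotide letters and taking the smallest listed length, 6, as the default minimum; the function name `is_fragment` and the example sequences are hypothetical.

```python
def is_fragment(candidate, reference, min_length=6):
    """Return True if candidate is a shorter, identical, contiguous run of reference.

    Mirrors the definition: reduced length relative to the reference, identical
    to the reference over the common portion, and at least min_length
    consecutive nucleotides long.
    """
    cand = candidate.upper()
    ref = reference.upper()
    return min_length <= len(cand) < len(ref) and cand in ref


# Hypothetical example sequences, used only for illustration.
reference_seq = "ATGGATTACAGC"
```

With this reference, `"GATTAC"` qualifies as a fragment, while `"GATT"` is too short and `"GATTTC"` is not identical over the common portion.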

[0054] As used herein, an “isolated nucleic acid fragment” is a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA.

[0055] A “gene” refers to an assembly of nucleotides that encode a polypeptide, and includes cDNA and genomic DNA nucleic acids. “Gene” also refers to a nucleic acid fragment that expresses a specific protein or polypeptide, including regulatory sequences preceding (5′ non-coding sequences) and following (3′ non-coding sequences) the coding sequence. “Native gene” refers to a gene as found in nature with its own regulatory sequences. “Chimeric gene” refers to any gene that is not a native gene, comprising regulatory and/or coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. A chimeric gene may comprise coding sequences derived from different sources and/or regulatory sequences derived from different sources. “Endogenous gene” refers to a native gene in its natural location in the genome of an organism. A “foreign” gene or “heterologous” gene refers to a gene not normally found in the host organism, but that is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, or chimeric genes. A “transgene” is a gene that has been introduced into the genome by a transformation procedure.

[0056] “Heterologous” DNA refers to DNA not naturally located in the cell, or in a chromosomal site of the cell. Preferably, the heterologous DNA includes a gene foreign to the cell.

[0057] The term “genome” includes chromosomal as well as mitochondrial, chloroplast and viral DNA or RNA.

[0058] A nucleic acid molecule is “hybridizable” to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength (see Sambrook et al., 1989 infra). Hybridization and washing conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (1989), particularly Chapter 11 and Table 11.1 therein (entirely incorporated herein by reference). The conditions of temperature and ionic strength determine the “stringency” of the hybridization.

[0059] Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. For preliminary screening for homologous nucleic acids, low stringency hybridization conditions, corresponding to a T_(m) of 55° C., can be used (e.g., 5×SSC, 0.1% SDS, 0.25% milk, and no formamide; or 30% formamide, 5×SSC, 0.5% SDS). Moderate stringency hybridization conditions correspond to a higher T_(m), e.g., 40% formamide, with 5× or 6×SSC. High stringency hybridization conditions correspond to the highest T_(m), e.g., 50% formamide, 5× or 6×SSC. Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible.

[0060] The term “complementary” is used to describe the relationship between nucleotide bases that are capable of hybridizing to one another. For example, with respect to DNA, adenosine is complementary to thymine and cytosine is complementary to guanine. Accordingly, the instant invention also includes isolated nucleic acid fragments that are complementary to the complete sequences as disclosed or used herein as well as those substantially similar nucleic acid sequences.
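The base-pairing relationship described above can be illustrated with a minimal sketch. The function names `reverse_complement` and `is_complementary` are hypothetical and are not part of the disclosure; the sketch assumes DNA written 5′ to 3′ using only the letters A, C, G and T.

```python
# Illustrative sketch only: Watson-Crick pairing for DNA (A:T, C:G).
# Names are hypothetical; sequences are assumed to be written 5' to 3'.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the strand that would anneal to `seq`, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_complementary(strand_a: str, strand_b: str) -> bool:
    """True when strand_b is the exact reverse complement of strand_a."""
    return reverse_complement(strand_a) == strand_b.upper()


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(is_complementary("ATGC", "GCAT"))
```

The sketch tests only perfect complementarity; as noted elsewhere in this section, hybridization at lower stringency can tolerate mismatches, which this check does not model.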

[0061] In a specific embodiment, the term “standard hybridization conditions” refers to a T_(m) of 55° C., and utilizes conditions as set forth above. In a preferred embodiment, the T_(m) is 60° C.; in a more preferred embodiment, the T_(m) is 65° C.

[0062] Post-hybridization washes also determine stringency conditions. One set of preferred conditions uses a series of washes starting with 6×SSC, 0.5% SDS at room temperature for 15 minutes (min), then repeated with 2×SSC, 0.5% SDS at 45° C. for 30 minutes, and then repeated twice with 0.2×SSC, 0.5% SDS at 50° C. for 30 minutes. A more preferred set of stringent conditions uses higher temperatures, in which the washes are identical to those above except that the temperature of the final two 30 min washes in 0.2×SSC, 0.5% SDS is increased to 60° C. Another preferred set of highly stringent conditions uses two final washes in 0.1×SSC, 0.1% SDS at 65° C. Hybridization requires that the two nucleic acids comprise complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible.

[0063] The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of T_(m) for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher T_(m)) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating T_(m) have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridization with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8).
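As an illustration of the kind of T_(m) equation referred to above, the following sketch uses a commonly cited empirical approximation for DNA:DNA hybrids longer than 100 nucleotides: T_(m) = 81.5 + 16.6·log10[Na+] + 0.41·(%G+C) − 0.63·(%formamide) − 600/L. This particular formula is supplied here as an assumption and is not reproduced in the text above; the function name and parameters are hypothetical.

```python
import math


def tm_long_hybrid(na_molar: float, gc_percent: float, length_nt: int,
                   formamide_percent: float = 0.0) -> float:
    """Approximate T_m (degrees C) of a DNA:DNA hybrid longer than 100 nt.

    Assumed empirical formula (not taken from this document):
      81.5 + 16.6*log10([Na+]) + 0.41*(%G+C) - 0.63*(%formamide) - 600/L
    """
    if length_nt <= 100:
        raise ValueError("approximation is intended for hybrids > 100 nt")
    if na_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_nt)


if __name__ == "__main__":
    # Hypothetical inputs: 1 M Na+, 50% G+C, 600 nt hybrid, with and
    # without 50% formamide.
    print(tm_long_hybrid(1.0, 50.0, 600))
    print(tm_long_hybrid(1.0, 50.0, 600, formamide_percent=50.0))
```

Under this approximation, added formamide lowers the computed T_(m) of the hybrid, so at a fixed incubation temperature a higher formamide percentage leaves fewer mismatched hybrids stable. The sketch does not cover oligonucleotides, for which mismatch position and probe length dominate.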

[0064] In one embodiment the length for a hybridizable nucleic acid is at least about 10 nucleotides. Preferably a minimum length for a hybridizable nucleic acid is at least about 15 nucleotides; more preferably at least about 20 nucleotides; and most preferably the length is at least 30 nucleotides. Furthermore, the skilled artisan will recognize that the temperature and wash solution salt concentration may be adjusted as necessary according to factors such as length of the probe.

[0065] The term “probe” refers to a single-stranded nucleic acid molecule that can base pair with a complementary single stranded target nucleic acid to form a double-stranded molecule. As used herein, the term “oligonucleotide” refers to a nucleic acid, generally of at least 18 nucleotides, that is hybridizable to a genomic DNA molecule, a cDNA molecule, a plasmid DNA or an mRNA molecule. Oligonucleotides can be labeled, e.g., with ³²P-nucleotides or nucleotides to which a label, such as biotin, has been covalently conjugated. A labeled oligonucleotide can be used as a probe to detect the presence of a nucleic acid. Oligonucleotides (one or both of which may be labeled) can be used as PCR primers, either for cloning full length or a fragment of a nucleic acid, or to detect the presence of a nucleic acid. An oligonucleotide can also be used to form a triple helix with a DNA molecule. Generally, oligonucleotides are prepared synthetically, preferably on a nucleic acid synthesizer. Accordingly, oligonucleotides can be prepared with non-naturally occurring phosphoester analog bonds, such as thioester bonds, etc.

[0066] A “primer” is an oligonucleotide that hybridizes to a target nucleic acid sequence to create a double stranded nucleic acid region that can serve as an initiation point for DNA synthesis under suitable conditions. Such primers may be used in a polymerase chain reaction.

[0067] “Polymerase chain reaction” is abbreviated PCR and means an in vitro method for enzymatically amplifying specific nucleic acid sequences. PCR involves a repetitive series of temperature cycles with each cycle comprising three stages: denaturation of the template nucleic acid to separate the strands of the target molecule, annealing a single stranded PCR oligonucleotide primer to the template nucleic acid, and extension of the annealed primer(s) by DNA polymerase. PCR provides a means to detect the presence of the target molecule and, under quantitative or semi-quantitative conditions, to determine the relative amount of that target molecule within the starting pool of nucleic acids.

[0068] “Reverse transcription-polymerase chain reaction” is abbreviatedRT-PCR and means an in vitro method for enzymatically producing a targetcDNA molecule or molecules from an RNA molecule or molecules, followedby enzymatic amplification of a specific nucleic acid sequence orsequences within the target cDNA molecule or molecules as describedabove. RT-PCR also provides a means to detect the presence of the targetmolecule and, under quantitative or semi-quantitative conditions, todetermine the relative amount of that target molecule within thestarting pool of nucleic acids.

[0069] A DNA “coding sequence” is a double-stranded DNA sequence that istranscribed and translated into a polypeptide in a cell in vitro or invivo when placed under the control of appropriate regulatory sequences.“Suitable regulatory sequences” refer to nucleotide sequences locatedupstream (5′ non-coding sequences), within, or downstream (3′ non-codingsequences) of a coding sequence, and which influence the transcription,RNA processing or stability, or translation of the associated codingsequence. Regulatory sequences may include promoters, translation leadersequences, introns, polyadenylation recognition sequences, RNAprocessing site, effector binding site and stem-loop structure. Theboundaries of the coding sequence are determined by a start codon at the5′ (amino) terminus and a translation stop codon at the 3′ (carboxyl)terminus. A coding sequence can include, but is not limited to,prokaryotic sequences, cDNA from mRNA, genomic DNA sequences, and evensynthetic DNA sequences. If the coding sequence is intended forexpression in a eukaryotic cell, a polyadenylation signal andtranscription termination sequence will usually be located 3′ to thecoding sequence.

[0070] “Open reading frame” is abbreviated ORF and means a length of nucleic acid sequence, either DNA, cDNA or RNA, that comprises a translation start signal or initiation codon, such as an ATG or AUG, and a termination codon and can be potentially translated into a polypeptide sequence.
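The ORF definition above can be sketched as a simple scan for an initiation codon followed, in the same reading frame, by a termination codon. The function name `find_orfs` is hypothetical, the three standard stop codons (TAA, TAG, TGA) are assumed, and only the given strand is scanned.

```python
# Illustrative sketch only: scan the three forward reading frames of one
# strand for ATG ... stop. Assumes the standard stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str) -> list[tuple[int, int, str]]:
    """Return (start, end, orf) tuples; `end` is exclusive and includes the stop."""
    dna = seq.upper().replace("U", "T")  # accept RNA input (AUG) as well
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                # Walk codon by codon until an in-frame stop codon is found.
                for j in range(i, len(dna) - 2, 3):
                    if dna[j:j + 3] in STOP_CODONS:
                        orfs.append((i, j + 3, dna[i:j + 3]))
                        i = j  # resume scanning after this stop codon
                        break
            i += 3
    return orfs


if __name__ == "__main__":
    for start, end, orf in find_orfs("CCATGAAATAGGG"):
        print(start, end, orf)
```

An ATG with no downstream in-frame stop codon is not reported, consistent with the requirement that an ORF comprise both an initiation codon and a termination codon.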

[0071] The term “head-to-head” is used herein to describe the orientation of two polynucleotide sequences in relation to each other. Two polynucleotides are positioned in a head-to-head orientation when the 5′ end of the coding strand of one polynucleotide is adjacent to the 5′ end of the coding strand of the other polynucleotide, whereby the direction of transcription of each polynucleotide proceeds away from the 5′ end of the other polynucleotide. The term “head-to-head” may be abbreviated (5′)-to-(5′) and may also be indicated by the symbols (←→) or (3′←5′5′→3′).

[0072] The term “tail-to-tail” is used herein to describe the orientation of two polynucleotide sequences in relation to each other. Two polynucleotides are positioned in a tail-to-tail orientation when the 3′ end of the coding strand of one polynucleotide is adjacent to the 3′ end of the coding strand of the other polynucleotide, whereby the direction of transcription of each polynucleotide proceeds toward the other polynucleotide. The term “tail-to-tail” may be abbreviated (3′)-to-(3′) and may also be indicated by the symbols (→←) or (5′→3′3′←5′).

[0073] The term “head-to-tail” is used herein to describe the orientation of two polynucleotide sequences in relation to each other. Two polynucleotides are positioned in a head-to-tail orientation when the 5′ end of the coding strand of one polynucleotide is adjacent to the 3′ end of the coding strand of the other polynucleotide, whereby the direction of transcription of each polynucleotide proceeds in the same direction as that of the other polynucleotide. The term “head-to-tail” may be abbreviated (5′)-to-(3′) and may also be indicated by the symbols (→→) or (5′→3′5′→3′).
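The three orientation terms defined above can be summarized in a short sketch. The encoding is an assumption made for illustration: each of two adjacent polynucleotides is described by the direction in which it is transcribed along a shared left-to-right axis, “+” for rightward (→) and “-” for leftward (←). The function name `orientation` is hypothetical.

```python
# Illustrative sketch only. "+" means transcription proceeds left to right
# (→); "-" means right to left (←). Arguments are given in left-to-right
# order along the shared nucleic acid.
def orientation(left: str, right: str) -> str:
    if left not in "+-" or right not in "+-" or len(left) != 1 or len(right) != 1:
        raise ValueError("each direction must be '+' or '-'")
    if left == "-" and right == "+":
        return "head-to-head"   # (←→): adjacent 5' ends, diverging
    if left == "+" and right == "-":
        return "tail-to-tail"   # (→←): adjacent 3' ends, converging
    return "head-to-tail"       # (→→) or (←←): same direction


if __name__ == "__main__":
    for pair in (("-", "+"), ("+", "-"), ("+", "+")):
        print(pair, orientation(*pair))
```

Two leftward-transcribed units (←←) are also classified as head-to-tail, since the 5′ end of one coding strand is again adjacent to the 3′ end of the other.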

[0074] The term “downstream” refers to a nucleotide sequence that is located 3′ to a reference nucleotide sequence. In particular, downstream nucleotide sequences generally relate to sequences that follow the starting point of transcription. For example, the translation initiation codon of a gene is located downstream of the start site of transcription.

[0075] The term “upstream” refers to a nucleotide sequence that is located 5′ to a reference nucleotide sequence. In particular, upstream nucleotide sequences generally relate to sequences that are located on the 5′ side of a coding sequence or starting point of transcription. For example, most promoters are located upstream of the start site of transcription.

[0076] The terms “restriction endonuclease” and “restriction enzyme” refer to an enzyme that binds and cuts within a specific nucleotide sequence within double stranded DNA.
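The site-specific cutting described above can be sketched as follows. EcoRI, which recognizes GAATTC and cuts the top strand after the first G, is used purely as a well-known illustration and is not named in the text above; the function names are hypothetical, and only the top strand is modeled.

```python
# Illustrative sketch only: locate recognition sites and cut the top strand.
def find_sites(dna: str, recognition: str) -> list[int]:
    """Return 0-based start positions of every occurrence of the site."""
    dna, site = dna.upper(), recognition.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


def digest(dna: str, recognition: str, cut_offset: int) -> list[str]:
    """Split `dna` at each site; `cut_offset` is the cut position within the site."""
    dna = dna.upper()
    fragments, previous = [], 0
    for site_start in find_sites(dna, recognition):
        cut = site_start + cut_offset
        fragments.append(dna[previous:cut])
        previous = cut
    fragments.append(dna[previous:])
    return fragments


if __name__ == "__main__":
    # EcoRI-like example (assumed): site GAATTC, cut after the first base.
    print(digest("AAGAATTCTT", "GAATTC", 1))
```

A fuller model would also track the bottom strand, so that the cohesive termini mentioned in the discussion of vectors could be represented explicitly.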

[0077] “Homologous recombination” refers to the insertion of a foreignDNA sequence into another DNA molecule, e.g., insertion of a vector in achromosome. Preferably, the vector targets a specific chromosomal sitefor homologous recombination. For specific homologous recombination, thevector will contain sufficiently long regions of homology to sequencesof the chromosome to allow complementary binding and incorporation ofthe vector into the chromosome. Longer regions of homology, and greaterdegrees of sequence similarity, may increase the efficiency ofhomologous recombination.

[0078] Several methods known in the art may be used to propagate apolynucleotide according to the invention. Once a suitable host systemand growth conditions are established, recombinant expression vectorscan be propagated and prepared in quantity. As described herein, theexpression vectors which can be used include, but are not limited to,the following vectors or their derivatives: human or animal viruses suchas vaccinia virus or adenovirus; insect viruses such as baculovirus;yeast vectors; bacteriophage vectors (e.g., lambda), and plasmid andcosmid DNA vectors, to name but a few.

[0079] A “vector” is any means for the cloning of and/or transfer of a nucleic acid into a host cell. A vector may be a replicon to which another DNA segment may be attached so as to bring about the replication of the attached segment. A “replicon” is any genetic element (e.g., plasmid, phage, cosmid, chromosome, virus) that functions as an autonomous unit of DNA replication in vivo, i.e., capable of replication under its own control. The term “vector” includes both viral and nonviral means for introducing the nucleic acid into a cell in vitro, ex vivo or in vivo. A large number of vectors known in the art may be used to manipulate nucleic acids, incorporate response elements and promoters into genes, etc. Possible vectors include, for example, plasmids or modified viruses including, for example, bacteriophages such as lambda derivatives, or plasmids such as pBR322 or pUC plasmid derivatives, or the Bluescript vector. For example, the insertion of the DNA fragments corresponding to response elements and promoters into a suitable vector can be accomplished by ligating the appropriate DNA fragments into a chosen vector that has complementary cohesive termini. Alternatively, the ends of the DNA molecules may be enzymatically modified or any site may be produced by ligating nucleotide sequences (linkers) into the DNA termini. Such vectors may be engineered to contain selectable marker genes that provide for the selection of cells that have incorporated the marker into the cellular genome. Such markers allow identification and/or selection of host cells that incorporate and express the proteins encoded by the marker.

[0080] Viral vectors, and particularly retroviral vectors, have beenused in a wide variety of gene delivery applications in cells, as wellas living animal subjects. Viral vectors that can be used include butare not limited to retrovirus, adeno-associated virus, pox, baculovirus,vaccinia, herpes simplex, Epstein-Barr, adenovirus, geminivirus, andcaulimovirus vectors. Non-viral vectors include plasmids, liposomes,electrically charged lipids (cytofectins), DNA-protein complexes, andbiopolymers. In addition to a nucleic acid, a vector may also compriseone or more regulatory regions, and/or selectable markers useful inselecting, measuring, and monitoring nucleic acid transfer results(transfer to which tissues, duration of expression, etc.).

[0081] The term “plasmid” refers to an extra chromosomal element oftencarrying a gene that is not part of the central metabolism of the cell,and usually in the form of circular double-stranded DNA molecules. Suchelements may be autonomously replicating sequences, genome integratingsequences, phage or nucleotide sequences, linear, circular, orsupercoiled, of a single- or double-stranded DNA or RNA, derived fromany source, in which a number of nucleotide sequences have been joinedor recombined into a unique construction which is capable of introducinga promoter fragment and DNA sequence for a selected gene product alongwith appropriate 3′ untranslated sequence into a cell.

[0082] A “cloning vector” is a “replicon”, which is a unit length of anucleic acid, preferably DNA, that replicates sequentially and whichcomprises an origin of replication, such as a plasmid, phage or cosmid,to which another nucleic acid segment may be attached so as to bringabout the replication of the attached segment. Cloning vectors may becapable of replication in one cell type and expression in another(“shuttle vector”).

[0083] Vectors may be introduced into the desired host cells by methods known in the art, e.g., transfection, electroporation, microinjection, transduction, cell fusion, DEAE dextran, calcium phosphate precipitation, lipofection (liposome fusion), use of a gene gun, or a DNA vector transporter (see, e.g., Wu et al., 1992, J. Biol. Chem. 267: 963-967; Wu and Wu, 1988, J. Biol. Chem. 263: 14621-14624; and Hartmut et al., Canadian Patent Application No. 2,012,311, filed Mar. 15, 1990).

[0084] A polynucleotide according to the invention can also be introduced in vivo by lipofection. For the past decade, there has been increasing use of liposomes for encapsulation and transfection of nucleic acids in vitro. Synthetic cationic lipids designed to limit the difficulties and dangers encountered with liposome mediated transfection can be used to prepare liposomes for in vivo transfection of a gene encoding a marker (Felgner et al., 1987, Proc. Natl. Acad. Sci. U.S.A. 84: 7413; Mackey, et al., 1988, Proc. Natl. Acad. Sci. U.S.A. 85: 8027-8031; and Ulmer et al., 1993, Science 259: 1745-1748). The use of cationic lipids may promote encapsulation of negatively charged nucleic acids, and also promote fusion with negatively charged cell membranes (Felgner and Ringold, 1989, Science 337: 387-388). Particularly useful lipid compounds and compositions for transfer of nucleic acids are described in International Patent Publications WO95/18863 and WO96/17823, and in U.S. Pat. No. 5,459,127. The use of lipofection to introduce exogenous genes into the specific organs in vivo has certain practical advantages. Molecular targeting of liposomes to specific cells represents one area of benefit. It is clear that directing transfection to particular cell types would be particularly preferred in a tissue with cellular heterogeneity, such as pancreas, liver, kidney, and the brain. Lipids may be chemically coupled to other molecules for the purpose of targeting (Mackey, et al., 1988, supra). Targeted peptides, e.g., hormones or neurotransmitters, and proteins such as antibodies, or non-peptide molecules could be coupled to liposomes chemically.

[0085] Other molecules are also useful for facilitating transfection of a nucleic acid in vivo, such as a cationic oligopeptide (e.g., WO95/21931), peptides derived from DNA binding proteins (e.g., WO96/25508), or a cationic polymer (e.g., WO95/21931).

[0086] It is also possible to introduce a vector in vivo as a naked DNA plasmid (see U.S. Pat. Nos. 5,693,622, 5,589,466 and 5,580,859). Receptor-mediated DNA delivery approaches can also be used (Curiel et al., 1992, Hum. Gene Ther. 3: 147-154; and Wu and Wu, 1987, J. Biol. Chem. 262: 4429-4432).

[0087] The term “transfection” means the uptake of exogenous orheterologous RNA or DNA by a cell. A cell has been “transfected” byexogenous or heterologous RNA or DNA when such RNA or DNA has beenintroduced inside the cell. A cell has been “transformed” by exogenousor heterologous RNA or DNA when the transfected RNA or DNA effects aphenotypic change. The transforming RNA or DNA can be integrated(covalently linked) into chromosomal DNA making up the genome of thecell.

[0088] “Transformation” refers to the transfer of a nucleic acidfragment into the genome of a host organism, resulting in geneticallystable inheritance. Host organisms containing the transformed nucleicacid fragments are referred to as “transgenic” or “recombinant” or“transformed” organisms.

[0089] The term “genetic region” will refer to a region of a nucleicacid molecule or a nucleotide sequence that comprises a gene encoding apolypeptide.

[0090] In addition, the recombinant vector comprising a polynucleotideaccording to the invention may include one or more origins forreplication in the cellular hosts in which their amplification or theirexpression is sought, markers or selectable markers.

[0091] The term “selectable marker” means an identifying factor, usuallyan antibiotic or chemical resistance gene, that is able to be selectedfor based upon the marker gene's effect, i.e., resistance to anantibiotic, resistance to a herbicide, colorimetric markers, enzymes,fluorescent markers, and the like, wherein the effect is used to trackthe inheritance of a nucleic acid of interest and/or to identify a cellor organism that has inherited the nucleic acid of interest. Examples ofselectable marker genes known and used in the art include: genesproviding resistance to ampicillin, streptomycin, gentamycin, kanamycin,hygromycin, bialaphos herbicide, sulfonamide, and the like; and genesthat are used as phenotypic markers, i.e., anthocyanin regulatory genes,isopentanyl transferase gene, and the like.

[0092] The term “reporter gene” means a nucleic acid encoding anidentifying factor that is able to be identified based upon the reportergene's effect, wherein the effect is used to track the inheritance of anucleic acid of interest, to identify a cell or organism that hasinherited the nucleic acid of interest, and/or to measure geneexpression induction or transcription. Examples of reporter genes knownand used in the art include: luciferase (Luc), green fluorescent protein(GFP), chloramphenicol acetyltransferase (CAT), β-galactosidase (LacZ),β-glucuronidase (Gus), and the like. Selectable marker genes may also beconsidered reporter genes.

[0093] “Promoter” refers to a DNA sequence capable of controlling theexpression of a coding sequence or functional RNA. In general, a codingsequence is located 3′ to a promoter sequence. Promoters may be derivedin their entirety from a native gene, or be composed of differentelements derived from different promoters found in nature, or evencomprise synthetic DNA segments. It is understood by those skilled inthe art that different promoters may direct the expression of a gene indifferent tissues or cell types, or at different stages of development,or in response to different environmental or physiological conditions.Promoters that cause a gene to be expressed in most cell types at mosttimes are commonly referred to as “constitutive promoters”. Promotersthat cause a gene to be expressed in a specific cell type are commonlyreferred to as “cell-specific promoters” or “tissue-specific promoters”.Promoters that cause a gene to be expressed at a specific stage ofdevelopment or cell differentiation are commonly referred to as“developmentally-specific promoters” or “cell differentiation-specificpromoters”. Promoters that are induced and cause a gene to be expressedfollowing exposure or treatment of the cell with an agent, biologicalmolecule, chemical, ligand, light, or the like that induces the promoterare commonly referred to as “inducible promoters” or “regulatablepromoters”. It is further recognized that since in most cases the exactboundaries of regulatory sequences have not been completely defined, DNAfragments of different lengths may have identical promoter activity.

[0094] A “promoter sequence” is a DNA regulatory region capable ofbinding RNA polymerase in a cell and initiating transcription of adownstream (3′ direction) coding sequence. For purposes of defining thepresent invention, the promoter sequence is bounded at its 3′ terminusby the transcription initiation site and extends upstream (5′ direction)to include the minimum number of bases or elements necessary to initiatetranscription at levels detectable above background. Within the promotersequence will be found a transcription initiation site (convenientlydefined for example, by mapping with nuclease S1), as well as proteinbinding domains (consensus sequences) responsible for the binding of RNApolymerase.

[0095] A coding sequence is “under the control” of transcriptional and translational control sequences in a cell when RNA polymerase transcribes the coding sequence into mRNA, which is then trans-RNA spliced (if the coding sequence contains introns) and translated into the protein encoded by the coding sequence.

[0096] “Transcriptional and translational control sequences” are DNA regulatory sequences, such as promoters, enhancers, terminators, and the like, that provide for the expression of a coding sequence in a host cell. In eukaryotic cells, polyadenylation signals are control sequences.

[0097] The term “response element” means one or more cis-acting DNA elements which confer responsiveness on a promoter mediated through interaction with the DNA-binding domains of the first chimeric gene. This DNA element may be either palindromic (perfect or imperfect) in its sequence or composed of sequence motifs or half sites separated by a variable number of nucleotides. The half sites can be similar or identical and arranged as either direct or inverted repeats or as a single half site or multimers of adjacent half sites in tandem. The response element may comprise a minimal promoter isolated from different organisms depending upon the nature of the cell or organism into which the response element will be incorporated. The DNA binding domain of the first hybrid protein binds, in the presence or absence of a ligand, to the DNA sequence of a response element to initiate or suppress transcription of downstream gene(s) under the regulation of this response element. Examples of DNA sequences for response elements of the natural ecdysone receptor include: RRGG/TTCANTGAC/ACYY (see Cherbas L., et. al., (1991), Genes Dev. 5, 120-131); AGGTCAN_((n))AGGTCA, where N_((n)) can be one or more spacer nucleotides (see D'Avino P P., et. al., (1995), Mol. Cell. Endocrinol, 113, 1-9); and GGGTTGAATGAATTT (see Antoniewski C., et. al., (1994), Mol. Cell Biol. 14, 4465-4474).
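The example response element sequences above can be searched for with regular expressions, as in the following sketch. Two assumptions are made for illustration: “G/T” and “C/A” in the first consensus are read as single ambiguous positions (IUPAC K and M, with R = A or G, Y = C or T, N = any base), and the spacer N_((n)) of the direct repeat is capped at a hypothetical maximum length. Function names are hypothetical.

```python
import re

# Illustrative sketch only. IUPAC ambiguity codes mapped to regex classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]", "N": "[ACGT]"}

# Assumed IUPAC rendering of RRGG/TTCANTGAC/ACYY (15 positions).
ECRE_CONSENSUS = "RRGKTCANTGAMCYY"


def consensus_to_regex(consensus: str) -> re.Pattern:
    return re.compile("".join(IUPAC[code] for code in consensus))


def find_consensus(dna: str, consensus: str = ECRE_CONSENSUS) -> list[int]:
    """Return 0-based start positions of non-overlapping consensus matches."""
    return [m.start() for m in consensus_to_regex(consensus).finditer(dna.upper())]


def find_direct_repeats(dna: str, max_spacer: int = 10) -> list[tuple[int, int]]:
    """Find AGGTCA-N(n)-AGGTCA; return (start, spacer_length) pairs."""
    pattern = re.compile("AGGTCA([ACGT]{1,%d})AGGTCA" % max_spacer)
    return [(m.start(), len(m.group(1))) for m in pattern.finditer(dna.upper())]


if __name__ == "__main__":
    print(find_consensus("TTAGGTTCAATGACCCTGG"))
    print(find_direct_repeats("AGGTCACTGAAGGTCA"))
```

Only the given strand is searched; scanning the reverse complement as well would be needed to find elements in either orientation.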

[0098] The term “operably linked” refers to the association of nucleicacid sequences on a single nucleic acid fragment so that the function ofone is affected by the other. For example, a promoter is operably linkedwith a coding sequence when it is capable of affecting the expression ofthat coding sequence (i.e., that the coding sequence is under thetranscriptional control of the promoter). Coding sequences can beoperably linked to regulatory sequences in sense or antisenseorientation.

[0099] The term “expression”, as used herein, refers to thetranscription and stable accumulation of sense (mRNA) or antisense RNAderived from a nucleic acid or polynucleotide. Expression may also referto translation of mRNA into a protein or polypeptide.

[0100] The terms “cassette”, “expression cassette” and “gene expressioncassette” refer to a segment of DNA that can be inserted into a nucleicacid or polynucleotide at specific restriction sites or by homologousrecombination. The segment of DNA comprises a polynucleotide thatencodes a polypeptide of interest, and the cassette and restrictionsites are designed to ensure insertion of the cassette in the properreading frame for transcription and translation. “Transformationcassette” refers to a specific vector comprising a polynucleotide thatencodes a polypeptide of interest and having elements in addition to thepolynucleotide that facilitate transformation of a particular host cell.Cassettes, expression cassettes, gene expression cassettes andtransformation cassettes of the invention may also comprise elementsthat allow for enhanced expression of a polynucleotide encoding apolypeptide of interest in a host cell. These elements may include, butare not limited to: a promoter, a minimal promoter, an enhancer, aresponse element, a terminator sequence, a polyadenylation sequence, andthe like.

[0101] For purposes of this invention, the term “gene switch” refers tothe combination of a response element associated with a promoter, and anEcR based system which, in the presence of one or more ligands,modulates the expression of a gene into which the response element andpromoter are incorporated.

[0102] The terms “modulate” and “modulates” mean to induce, reduce or inhibit nucleic acid or gene expression, resulting in the respective induction, reduction or inhibition of protein or polypeptide production.

[0103] The plasmids or vectors according to the invention may further comprise at least one promoter suitable for driving expression of a gene in a host cell. The term “expression vector” means a vector, plasmid or vehicle designed to enable the expression of an inserted nucleic acid sequence following transformation into the host. The cloned gene, i.e., the inserted nucleic acid sequence, is usually placed under the control of control elements such as a promoter, a minimal promoter, an enhancer, or the like. Initiation control regions or promoters, which are useful to drive expression of a nucleic acid in the desired host cell, are numerous and familiar to those skilled in the art. Virtually any promoter capable of driving these genes is suitable for the present invention including but not limited to: viral promoters, bacterial promoters, animal promoters, mammalian promoters, synthetic promoters, constitutive promoters, tissue specific promoters, developmental specific promoters, inducible promoters, light regulated promoters; CYC1, HIS3, GAL1, GAL4, GAL10, ADH1, PGK, PHO5, GAPDH, ADC1, TRP1, URA3, LEU2, ENO, TPI, alkaline phosphatase promoters (useful for expression in Saccharomyces); AOX1 promoter (useful for expression in Pichia); β-lactamase, lac, ara, tet, trp, λP_(L), λP_(R), T7, tac, and trc promoters (useful for expression in Escherichia coli); light regulated-promoters; animal and mammalian promoters known in the art include, but are not limited to, the SV40 early (SV40e) promoter region, the promoter contained in the 3′ long terminal repeat (LTR) of Rous sarcoma virus (RSV), the promoters of the E1A or major late promoter (MLP) genes of adenoviruses (Ad), the cytomegalovirus (CMV) early promoter, the herpes simplex virus (HSV) thymidine kinase (TK) promoter, an elongation factor 1 alpha (EF1) promoter, a phosphoglycerate kinase (PGK) promoter, a ubiquitin (Ubc) promoter, an albumin promoter, the regulatory sequences of the mouse metallothionein-L promoter and transcriptional control regions, the ubiquitous promoters (HPRT, vimentin, α-actin, tubulin and the like), the promoters of the intermediate filaments (desmin, neurofilaments, keratin, GFAP, and the like), the promoters of therapeutic genes (of the MDR, CFTR or factor VIII type, and the like), pathogenesis or disease related-promoters, and promoters that exhibit tissue specificity and have been utilized in transgenic animals, such as the elastase I gene control region which is active in pancreatic acinar cells; insulin gene control region active in pancreatic beta cells, immunoglobulin gene control region active in lymphoid cells, mouse mammary tumor virus control region active in testicular, breast, lymphoid and mast cells; albumin gene, Apo AI and Apo AII control regions active in liver, alpha-fetoprotein gene control region active in liver, alpha 1-antitrypsin gene control region active in the liver, beta-globin gene control region active in myeloid cells, myelin basic protein gene control region active in oligodendrocyte cells in the brain, myosin light chain-2 gene control region active in skeletal muscle, and gonadotropic releasing hormone gene control region active in the hypothalamus, pyruvate kinase promoter, villin promoter, promoter of the fatty acid binding intestinal protein, promoter of the smooth muscle cell α-actin, and the like. In addition, these expression sequences may be modified by addition of enhancer or regulatory sequences and the like.

[0104] Enhancers that may be used in embodiments of the invention include but are not limited to: an SV40 enhancer, a cytomegalovirus (CMV) enhancer, an elongation factor 1 (EF1) enhancer, yeast enhancers, viral gene enhancers, and the like.

[0105] Termination control regions, i.e., terminator or polyadenylation sequences, may also be derived from various genes native to the preferred hosts. Optionally, a termination site may be unnecessary; however, it is most preferred if included. In a preferred embodiment of the invention, the termination control region may comprise or be derived from a synthetic sequence, synthetic polyadenylation signal, an SV40 late polyadenylation signal, an SV40 polyadenylation signal, a bovine growth hormone (BGH) polyadenylation signal, viral terminator sequences, or the like.

[0106] The terms “3′ non-coding sequences” or “3′ untranslated region (UTR)” refer to DNA sequences located downstream (3′) of a coding sequence and may comprise polyadenylation [poly(A)] recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3′ end of the mRNA precursor.

[0107] “Regulatory region” means a nucleic acid sequence which regulates the expression of a second nucleic acid sequence. A regulatory region may include sequences which are naturally responsible for expressing a particular nucleic acid (a homologous region) or may include sequences of a different origin that are responsible for expressing different proteins or even synthetic proteins (a heterologous region). In particular, the sequences can be sequences of prokaryotic, eukaryotic, or viral genes or derived sequences that stimulate or repress transcription of a gene in a specific or non-specific manner and in an inducible or non-inducible manner. Regulatory regions include origins of replication, RNA splice sites, promoters, enhancers, transcriptional termination sequences, and signal sequences which direct the polypeptide into the secretory pathways of the target cell.

[0108] A regulatory region from a “heterologous source” is a regulatory region that is not naturally associated with the expressed nucleic acid. Included among the heterologous regulatory regions are regulatory regions from a different species, regulatory regions from a different gene, hybrid regulatory sequences, and regulatory sequences which do not occur in nature, but which are designed by one having ordinary skill in the art.

[0109] “RNA transcript” refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript; alternatively, it may be an RNA sequence derived from post-transcriptional processing of the primary transcript, in which case it is referred to as the mature RNA. “Messenger RNA (mRNA)” refers to the RNA that is without introns and that can be translated into protein by the cell. “cDNA” refers to a double-stranded DNA that is complementary to and derived from mRNA. “Sense” RNA refers to an RNA transcript that includes the mRNA and so can be translated into protein by the cell. “Antisense RNA” refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target gene. The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5′ non-coding sequence, 3′ non-coding sequence, or the coding sequence. “Functional RNA” refers to antisense RNA, ribozyme RNA, or other RNA that is not translated yet has an effect on cellular processes.

[0110] A “polypeptide” is a polymeric compound comprised of covalently linked amino acid residues. Amino acids have the following general structure: H2N-CH(R)-COOH, in which R is the side chain.

[0111] Amino acids are classified into seven groups on the basis of the side chain R: (1) aliphatic side chains, (2) side chains containing a hydroxylic (OH) group, (3) side chains containing sulfur atoms, (4) side chains containing an acidic or amide group, (5) side chains containing a basic group, (6) side chains containing an aromatic ring, and (7) proline, an imino acid in which the side chain is fused to the amino group. A polypeptide of the invention preferably comprises at least about 14 amino acids.

[0112] A “protein” is a polypeptide that performs a structural or functional role in a living cell.

[0113] An “isolated polypeptide” or “isolated protein” is a polypeptide or protein that is substantially free of those compounds that are normally associated therewith in its natural state (e.g., other proteins or polypeptides, nucleic acids, carbohydrates, lipids). “Isolated” is not meant to exclude artificial or synthetic mixtures with other compounds, or the presence of impurities which do not interfere with biological activity, and which may be present, for example, due to incomplete purification, addition of stabilizers, or compounding into a pharmaceutically acceptable preparation.

[0114] “Fragment” of a polypeptide according to the invention will be understood to mean a polypeptide whose amino acid sequence is shorter than that of the reference polypeptide and which comprises, over the entire portion it shares with the reference polypeptide, an identical amino acid sequence. Such fragments may, where appropriate, be included in a larger polypeptide of which they are a part. Such fragments of a polypeptide according to the invention may have a length of at least 2, 3, 4, 5, 6, 8, 10, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 25, 26, 30, 35, 40, 45, 50, 100, 200, 240, or 300 amino acids.
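
The definition of a fragment given above may be sketched as a simple test. The sketch assumes that "identical over the entire portion" means the fragment occurs as one contiguous run within the reference sequence; that reading, and the function name `is_fragment`, are illustrative assumptions rather than part of the specification.

```python
def is_fragment(candidate: str, reference: str) -> bool:
    """Return True when `candidate` is a fragment of `reference`.

    Assumed reading of the definition: the candidate must be strictly
    shorter than the reference and must match it exactly over its whole
    length, i.e. it must be a contiguous substring of the reference.
    """
    candidate = candidate.upper()
    reference = reference.upper()
    return 0 < len(candidate) < len(reference) and candidate in reference
```

For example, `is_fragment("KLM", "AKLMN")` holds, whereas the full-length sequence itself, or a sequence with an internal substitution such as `"KXM"`, does not qualify.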

[0115] A “variant” of a polypeptide or protein is any analogue, fragment, derivative, or mutant which is derived from a polypeptide or protein and which retains at least one biological property of the polypeptide or protein. Different variants of the polypeptide or protein may exist in nature. These variants may be allelic variations characterized by differences in the nucleotide sequences of the structural gene coding for the protein, or may involve differential splicing or post-translational modification. The skilled artisan can produce variants having single or multiple amino acid substitutions, deletions, additions, or replacements. These variants may include, inter alia: (a) variants in which one or more amino acid residues are substituted with conservative or non-conservative amino acids, (b) variants in which one or more amino acids are added to the polypeptide or protein, (c) variants in which one or more of the amino acids includes a substituent group, and (d) variants in which the polypeptide or protein is fused with another polypeptide such as serum albumin. The techniques for obtaining these variants, including genetic (suppressions, deletions, mutations, etc.), chemical, and enzymatic techniques, are known to persons having ordinary skill in the art. A variant polypeptide preferably comprises at least about 14 amino acids.

[0116] A “heterologous protein” refers to a protein not naturally produced in the cell.

[0117] A “mature protein” refers to a post-translationally processed polypeptide; i.e., one from which any pre- or propeptides present in the primary translation product have been removed. “Precursor” protein refers to the primary product of translation of mRNA; i.e., with pre- and propeptides still present. Pre- and propeptides may be but are not limited to intracellular localization signals.

[0118] The term “signal peptide” refers to an amino terminal polypeptide preceding the secreted mature protein. The signal peptide is cleaved from and is therefore not present in the mature protein. Signal peptides have the function of directing and translocating secreted proteins across cell membranes. Signal peptide is also referred to as signal protein.

[0119] A “signal sequence” is included at the beginning of the coding sequence of a protein to be expressed on the surface of a cell. This sequence encodes a signal peptide, N-terminal to the mature polypeptide, that directs the host cell to translocate the polypeptide. The term “translocation signal sequence” is used herein to refer to this sort of signal sequence. Translocation signal sequences can be found associated with a variety of proteins native to eukaryotes and prokaryotes, and are often functional in both types of organisms.

[0120] The term “homology” refers to the percent of identity between two polynucleotide or two polypeptide moieties. The correspondence between the sequence from one moiety to another can be determined by techniques known to the art. For example, homology can be determined by a direct comparison of the sequence information between two polypeptide molecules by aligning the sequence information and using readily available computer programs. Alternatively, homology can be determined by hybridization of polynucleotides under conditions that form stable duplexes between homologous regions, followed by digestion with single-stranded-specific nuclease(s) and size determination of the digested fragments.

[0121] As used herein, the term “homologous” in all its grammatical forms and spelling variations refers to the relationship between proteins that possess a “common evolutionary origin,” including proteins from superfamilies (e.g., the immunoglobulin superfamily) and homologous proteins from different species (e.g., myosin light chain, etc.) (Reeck et al., 1987, Cell 50:667). Such proteins (and their encoding genes) have sequence homology, as reflected by their high degree of sequence similarity. However, in common usage and in the instant application, the term “homologous,” when modified with an adverb such as “highly,” may refer to sequence similarity and not a common evolutionary origin.

[0122] Accordingly, the term “sequence similarity” in all its grammatical forms refers to the degree of identity or correspondence between nucleic acid or amino acid sequences of proteins that may or may not share a common evolutionary origin (see Reeck et al., 1987, Cell 50:667).

[0123] In a specific embodiment, two DNA sequences are “substantially homologous” or “substantially similar” when at least about 50% (preferably at least about 75%, and most preferably at least about 90% or 95%) of the nucleotides match over the defined length of the DNA sequences. Sequences that are substantially homologous can be identified by comparing the sequences using standard software available in sequence data banks, or in a Southern hybridization experiment under, for example, stringent conditions as defined for that particular system. Defining appropriate hybridization conditions is within the skill of the art. See, e.g., Sambrook et al., 1989, supra.

[0124] As used herein, “substantially similar” refers to nucleic acid fragments wherein changes in one or more nucleotide bases result in substitution of one or more amino acids, but do not affect the functional properties of the protein encoded by the DNA sequence. “Substantially similar” also refers to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate alteration of gene expression by antisense or co-suppression technology. “Substantially similar” also refers to modifications of the nucleic acid fragments of the instant invention such as deletion or insertion of one or more nucleotide bases that do not substantially affect the functional properties of the resulting transcript. It is therefore understood that the invention encompasses more than the specific exemplary sequences. Each of the proposed modifications is well within the routine skill in the art, as is determination of retention of biological activity of the encoded products.

[0125] Moreover, the skilled artisan recognizes that substantially similar sequences encompassed by this invention are also defined by their ability to hybridize, under stringent conditions (0.1×SSC, 0.1% SDS, 65° C. and washed with 2×SSC, 0.1% SDS followed by 0.1×SSC, 0.1% SDS), with the sequences exemplified herein. Substantially similar nucleic acid fragments of the instant invention are those nucleic acid fragments whose DNA sequences are at least 70% identical to the DNA sequence of the nucleic acid fragments reported herein. Preferred substantially similar nucleic acid fragments of the instant invention are those nucleic acid fragments whose DNA sequences are at least 80% identical to the DNA sequence of the nucleic acid fragments reported herein. More preferred nucleic acid fragments are at least 90% identical to the DNA sequence of the nucleic acid fragments reported herein. Even more preferred are nucleic acid fragments that are at least 95% identical to the DNA sequence of the nucleic acid fragments reported herein.
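
The identity thresholds recited above (70%, 80%, 90%, and 95%) may be illustrated with a short calculation. The following is a minimal sketch that assumes the two DNA sequences have already been aligned to equal length with `-` marking gaps, and that a position gapped in only one sequence counts as a mismatch; neither convention is specified herein, and the names `percent_identity` and `identity_tier` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of matching positions between two pre-aligned sequences.

    Assumptions: equal-length alignment, '-' marks a gap, columns gapped
    in both sequences are ignored, and a one-sided gap is a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the tiers recited in the text."""
    tiers = (
        (95.0, "even more preferred"),
        (90.0, "more preferred"),
        (80.0, "preferred"),
        (70.0, "substantially similar"),
    )
    for threshold, label in tiers:
        if pct >= threshold:
            return label
    return "below the 70% threshold"
```

In practice the alignment itself would be produced by sequence analysis software; the sketch covers only the counting step that follows alignment.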

[0126] Two amino acid sequences are “substantially homologous” or “substantially similar” when greater than about 40% of the amino acids are identical, or greater than 60% are similar (functionally identical). Preferably, the similar or homologous sequences are identified by alignment using, for example, the GCG (Genetics Computer Group, Program Manual for the GCG Package, Version 7, Madison, Wis.) pileup program.
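
The two-part test for amino acid sequences (greater than about 40% identical, or greater than 60% similar) may be sketched as follows. The specification does not define which residues count as "functionally identical"; the sketch assumes, purely for illustration, that two residues are similar when they fall in the same side-chain group, and it assumes pre-aligned sequences of equal length. All names are hypothetical.

```python
# Assumed similarity groups (one-letter codes); illustrative only.
SIMILARITY_GROUPS = ("GAVLI", "ST", "CM", "DENQ", "KRH", "FYW", "P")
GROUP_OF = {aa: i for i, group in enumerate(SIMILARITY_GROUPS) for aa in group}


def identity_and_similarity(aligned_a: str, aligned_b: str):
    """Return (percent identical, percent similar) over all aligned columns."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be non-empty and of equal length")
    a, b = aligned_a.upper(), aligned_b.upper()
    identical = 0
    similar = 0
    for x, y in zip(a, b):
        if x == y and x in GROUP_OF:
            identical += 1
        # Identical residues are also counted as similar.
        if x in GROUP_OF and GROUP_OF[x] == GROUP_OF.get(y):
            similar += 1
    n = len(a)
    return 100.0 * identical / n, 100.0 * similar / n


def substantially_similar(aligned_a: str, aligned_b: str) -> bool:
    """Apply the >40% identical or >60% similar test."""
    identical, similar = identity_and_similarity(aligned_a, aligned_b)
    return identical > 40.0 or similar > 60.0
```

A substitution matrix supplied by the alignment program could replace the grouping used here; the grouping is chosen only because it keeps the example short.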

[0127] The term “corresponding to” is used herein to refer to similar or homologous sequences, whether the exact position is identical or different from the molecule to which the similarity or homology is measured. A nucleic acid or amino acid sequence alignment may include spaces. Thus, the term “corresponding to” refers to the sequence similarity, and not the numbering of the amino acid residues or nucleotide bases.

[0128] A “substantial portion” of an amino acid or nucleotide sequence comprises enough of the amino acid sequence of a polypeptide or the nucleotide sequence of a gene to putatively identify that polypeptide or gene, either by manual evaluation of the sequence by one skilled in the art, or by computer-automated sequence comparison and identification using algorithms such as BLAST (Basic Local Alignment Search Tool; Altschul, S. F., et al., (1990) J. Mol. Biol. 215: 403-410; see also www.ncbi.nlm.nih.gov/BLAST/). In general, a sequence of ten or more contiguous amino acids or thirty or more nucleotides is necessary in order to putatively identify a polypeptide or nucleic acid sequence as homologous to a known protein or gene. Moreover, with respect to nucleotide sequences, gene specific oligonucleotide probes comprising 20-30 contiguous nucleotides may be used in sequence-dependent methods of gene identification (e.g., Southern hybridization) and isolation (e.g., in situ hybridization of bacterial colonies or bacteriophage plaques). In addition, short oligonucleotides of 12-15 bases may be used as amplification primers in PCR in order to obtain a particular nucleic acid fragment comprising the primers. Accordingly, a “substantial portion” of a nucleotide sequence comprises enough of the sequence to specifically identify and/or isolate a nucleic acid fragment comprising the sequence.

[0129] The term “percent identity”, as known in the art, is a relationship between two or more polypeptide sequences or two or more polynucleotide sequences, as determined by comparing the sequences. In the art, “identity” also means the degree of sequence relatedness between polypeptide or polynucleotide sequences, as the case may be, as determined by the match between strings of such sequences. “Identity” and “similarity” can be readily calculated by known methods, including but not limited to those described in: Computational Molecular Biology (Lesk, A. M., ed.) Oxford University Press, New York (1988); Biocomputing: Informatics and Genome Projects (Smith, D. W., ed.) Academic Press, New York (1993); Computer Analysis of Sequence Data, Part I (Griffin, A. M., and Griffin, H. G., eds.) Humana Press, New Jersey (1994); Sequence Analysis in Molecular Biology (von Heijne, G., ed.) Academic Press (1987); and Sequence Analysis Primer (Gribskov, M. and Devereux, J., eds.) Stockton Press, New York (1991). Preferred methods to determine identity are designed to give the best match between the sequences tested. Methods to determine identity and similarity are codified in publicly available computer programs. Sequence alignments and percent identity calculations may be performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences may be performed using the Clustal method of alignment (Higgins and Sharp (1989) CABIOS. 5: 151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments using the Clustal method may be selected: KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5.

[0130] The term “sequence analysis software” refers to any computer algorithm or software program that is useful for the analysis of nucleotide or amino acid sequences. “Sequence analysis software” may be commercially available or independently developed. Typical sequence analysis software will include but is not limited to the GCG suite of programs (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wis.), BLASTP, BLASTN, BLASTX (Altschul et al., J. Mol. Biol. 215:403-410 (1990)), and DNASTAR (DNASTAR, Inc. 1228 S. Park St. Madison, Wis. 53715 USA). Within the context of this application it will be understood that, where sequence analysis software is used for analysis, the results of the analysis will be based on the “default values” of the program referenced, unless otherwise specified. As used herein “default values” will mean any set of values or parameters which originally load with the software when first initialized.

[0131] “Synthetic genes” can be assembled from oligonucleotide building blocks that are chemically synthesized using procedures known to those skilled in the art. These building blocks are ligated and annealed to form gene segments that are then enzymatically assembled to construct the entire gene. “Chemically synthesized”, as related to a sequence of DNA, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of DNA may be accomplished using well-established procedures, or automated chemical synthesis can be performed using one of a number of commercially available machines. Accordingly, the genes can be tailored for optimal gene expression based on optimization of nucleotide sequence to reflect the codon bias of the host cell. The skilled artisan appreciates the likelihood of successful gene expression if codon usage is biased towards those codons favored by the host. Determination of preferred codons can be based on a survey of genes derived from the host cell where sequence information is available.
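
The survey-based determination of preferred codons described above may be sketched as follows. This is an illustrative outline only: it assumes the standard genetic code, takes a list of in-frame host coding sequences as input, picks the single most frequent codon for each amino acid, and back-translates a protein with those codons. The function names are hypothetical, and practical codon optimization typically weighs further factors that are not modeled here.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def preferred_codons(host_coding_sequences):
    """Survey host coding sequences; return the most used codon per amino acid."""
    counts = defaultdict(Counter)
    for cds in host_coding_sequences:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = cds[i:i + 3]
            amino_acid = CODON_TABLE.get(codon)
            if amino_acid is not None:  # skip codons with ambiguous bases
                counts[amino_acid][codon] += 1
    return {aa: usage.most_common(1)[0][0] for aa, usage in counts.items()}


def back_translate(protein, preferred):
    """Encode a protein using the host-preferred codon for each residue."""
    return "".join(preferred[aa] for aa in protein.upper())
```

Given a survey in which alanine is encoded by GCC more often than by GCT, `back_translate` emits GCC for every alanine. An amino acid absent from the survey raises a `KeyError`, which signals that more host sequence information is needed.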

GENE EXPRESSION MODULATION SYSTEM OF THE INVENTION

[0132] Applicants have previously shown that separating the transactivation and DNA binding domains by placing them on two different proteins results in greatly reduced background activity in the absence of a ligand and significantly increased activity over background in the presence of a ligand (pending application PCT/US01/09050). This two-hybrid system is a significantly improved inducible gene expression modulation system compared to the two systems disclosed in International Patent Applications PCT/US97/05330 and PCT/US98/14215. The two-hybrid system exploits the ability of a pair of interacting proteins to bring the transcription activation domain into a more favorable position relative to the DNA binding domain such that when the DNA binding domain binds to the DNA binding site on the gene, the transactivation domain more effectively activates the promoter (see, for example, U.S. Pat. No. 5,283,173). Briefly, the two-hybrid gene expression system comprises two gene expression cassettes; the first encoding a DNA binding domain fused to a nuclear receptor polypeptide, and the second encoding a transactivation domain fused to a different nuclear receptor polypeptide. In the presence of ligand, the interaction of the first polypeptide with the second polypeptide effectively tethers the DNA binding domain to the transactivation domain. Since the DNA binding and transactivation domains reside on two different molecules, the background activity in the absence of ligand is greatly reduced.

[0133] The two-hybrid ecdysone receptor-based gene expression modulation system may be either heterodimeric or homodimeric. A functional EcR complex generally refers to a heterodimeric protein complex consisting of two members of the steroid receptor family, an ecdysone receptor protein obtained from various insects, and an ultraspiracle (USP) protein or the vertebrate homolog of USP, retinoid X receptor protein (see Yao, et al. (1993) Nature 366, 476-479; Yao, et al., (1992) Cell 71, 63-72). However, the complex may also be a homodimer as detailed below. The functional ecdysteroid receptor complex may also include additional protein(s) such as immunophilins. Additional members of the steroid receptor family of proteins, known as transcriptional factors (such as DHR38 or betaFTZ-1), may also be ligand dependent or independent partners for EcR, USP, and/or RXR. Additionally, other cofactors may be required such as proteins generally known as coactivators (also termed adapters or mediators). These proteins do not bind sequence-specifically to DNA and are not involved in basal transcription. They may exert their effect on transcription activation through various mechanisms, including stimulation of DNA-binding of activators, by affecting chromatin structure, or by mediating activator-initiation complex interactions. Examples of such coactivators include RIP140, TIF1, RAP46/Bag-1, ARA70, SRC-1/NCoA-1, TIF2/GRIP/NCoA-2, ACTR/AIB1/RAC3/pCIP as well as the promiscuous coactivator cAMP response element binding protein (CREB) binding protein, CBP/p300 (for review see Glass et al., Curr. Opin. Cell Biol. 9: 222-232, 1997). Also, protein cofactors generally known as corepressors (also known as repressors, silencers, or silencing mediators) may be required to effectively inhibit transcriptional activation in the absence of ligand. These corepressors may interact with the unliganded ecdysone receptor to silence the activity at the response element. Current evidence suggests that the binding of ligand changes the conformation of the receptor, which results in release of the corepressor and recruitment of the above described coactivators, thereby abolishing the silencing activity of the corepressors. Examples of corepressors include N-CoR and SMRT (for review, see Horwitz et al. Mol Endocrinol. 10: 1167-1177, 1996). These cofactors may either be endogenous within the cell or organism, or may be added exogenously as transgenes to be expressed in either a regulated or unregulated fashion. Homodimer complexes of the ecdysone receptor protein, USP, or RXR may also be functional under some circumstances.

[0134] The ecdysone receptor complex typically includes proteins which are members of the nuclear receptor superfamily wherein all members are generally characterized by the presence of an amino-terminal transactivation domain, a DNA binding domain (“DBD”), and a ligand binding domain (“LBD”) separated from the DBD by a hinge region. As used herein, the term “DNA binding domain” comprises a minimal polypeptide sequence of a DNA binding protein, up to the entire length of a DNA binding protein, so long as the DNA binding domain functions to associate with a particular response element. Members of the nuclear receptor superfamily are also characterized by the presence of four or five domains: A/B, C, D, E, and in some members F (see U.S. Pat. No. 4,981,784 and Evans, Science 240:889-895 (1988)). The “A/B” domain corresponds to the transactivation domain, “C” corresponds to the DNA binding domain, “D” corresponds to the hinge region, and “E” corresponds to the ligand binding domain. Some members of the family may also have another transactivation domain on the carboxy-terminal side of the LBD corresponding to “F”.

[0135] The DBD is characterized by the presence of two cysteine zinc fingers between which are two amino acid motifs, the P-box and the D-box, which confer specificity for ecdysone response elements. These domains may be either native, modified, or chimeras of different domains of heterologous receptor proteins. This EcR receptor, like a subset of the steroid receptor family, also possesses less well-defined regions responsible for heterodimerization properties. Because the domains of EcR, USP, and RXR are modular in nature, the LBD, DBD, and transactivation domains may be interchanged.

[0136] Gene switch systems are known that incorporate components from the ecdysone receptor complex. However, in these known systems, whenever EcR is used it is associated with native or modified DNA binding domains and transactivation domains on the same molecule. USP or RXR are typically used as silent partners. Applicants have previously shown that when DNA binding domains and transactivation domains are on the same molecule the background activity in the absence of ligand is high and that such activity is dramatically reduced when DNA binding domains and transactivation domains are on different molecules, that is, on each of two partners of a heterodimeric or homodimeric complex (see PCT/US01/09050). This two-hybrid system also provides improved sensitivity to non-steroidal ligands, for example diacylhydrazines, when compared to steroidal ligands, for example ponasterone A (“PonA”) or muristerone A (“MurA”). That is, when compared to steroids, the non-steroidal ligands provide higher activity at a lower concentration. In addition, since transactivation based on EcR gene switches is often cell-line dependent, it is easier to tailor switching systems to obtain maximum transactivation capability for each application. Furthermore, the two-hybrid system avoids some side effects due to overexpression of RXR that often occur when unmodified RXR is used as a switching partner. In a specific embodiment of the two-hybrid system, native DNA binding and transactivation domains of EcR or RXR are eliminated and, as a result, these chimeric molecules have less chance of interacting with other steroid hormone receptors present in the cell, resulting in reduced side effects.

[0137] Applicants have previously shown that an ecdysone receptor in partnership with a dipteran (fruit fly Drosophila melanogaster) or a lepidopteran (spruce bud worm Choristoneura fumiferana) ultraspiracle protein (USP) is constitutively expressed in mammalian cells, while an ecdysone receptor in partnership with a vertebrate retinoid X receptor (RXR) is inducible in mammalian cells (pending application PCT/US01/09050). Recently, Applicants made the surprising discovery that the ultraspiracle protein of Locusta migratoria (“LmUSP”) and the RXR homolog 1 and RXR homolog 2 of the ixodid tick Amblyomma americanum (“AmaRXR1” and “AmaRXR2”, respectively) and their non-Dipteran, non-Lepidopteran homologs including, but not limited to: fiddler crab Celuca pugilator RXR homolog (“CpRXR”), beetle Tenebrio molitor RXR homolog (“TmRXR”), honeybee Apis mellifera RXR homolog (“AmRXR”), and an aphid Myzus persicae RXR homolog (“MpRXR”), all of which are referred to herein collectively as invertebrate RXRs, can function similarly to vertebrate retinoid X receptor (RXR) in an ecdysone receptor-based inducible gene expression system in mammalian cells (US application filed herewith, incorporated by reference herein, in its entirety).

[0138] As described herein, Applicants have now discovered that a chimeric RXR ligand binding domain comprising at least two polypeptide fragments, wherein the first polypeptide fragment is from one species of vertebrate/invertebrate RXR and the second polypeptide fragment is from a different species of vertebrate/invertebrate RXR, whereby a vertebrate/invertebrate chimeric RXR ligand binding domain, a vertebrate/vertebrate chimeric RXR ligand binding domain, or an invertebrate/invertebrate chimeric RXR ligand binding domain is produced, can function in an ecdysone receptor-based inducible gene expression system. Surprisingly, Applicants' novel EcR/chimeric RXR-based inducible gene expression system can function similarly to or better than both the EcR/vertebrate RXR-based gene expression system (PCT/US01/09050) and the EcR/invertebrate RXR-based gene expression system (US application filed herewith) in terms of ligand sensitivity and magnitude of gene induction. Thus, the present invention provides an improved EcR-based inducible gene expression system for use in bacterial, fungal, yeast, animal, and mammalian cells.

[0139] In particular, Applicants describe herein a novel two-hybrid system that comprises a chimeric RXR ligand binding domain. This novel gene expression system demonstrates for the first time that a polypeptide comprising a chimeric RXR ligand binding domain can function as a component of an EcR-based inducible gene expression system in yeast and mammalian cells. As discussed herein, this finding is both unexpected and surprising.

[0140] Specifically, Applicants' invention relates to a gene expression modulation system comprising: a) a first gene expression cassette that is capable of being expressed in a host cell, wherein the first gene expression cassette comprises a polynucleotide that encodes a first hybrid polypeptide comprising i) a DNA-binding domain that recognizes a response element associated with a gene whose expression is to be modulated; and ii) an ecdysone receptor ligand binding domain; and b) a second gene expression cassette that is capable of being expressed in the host cell, wherein the second gene expression cassette comprises a polynucleotide sequence that encodes a second hybrid polypeptide comprising i) a transactivation domain; and ii) a chimeric retinoid X receptor ligand binding domain.

[0141] The present invention also relates to a gene expression modulation system comprising: a) a first gene expression cassette that is capable of being expressed in a host cell, wherein the first gene expression cassette comprises a polynucleotide that encodes a first hybrid polypeptide comprising i) a DNA-binding domain that recognizes a response element associated with a gene whose expression is to be modulated; and ii) a chimeric retinoid X receptor ligand binding domain; and b) a second gene expression cassette that is capable of being expressed in the host cell, wherein the second gene expression cassette comprises a polynucleotide sequence that encodes a second hybrid polypeptide comprising i) a transactivation domain; and ii) an ecdysone receptor ligand binding domain.

[0142] The present invention also relates to a gene expression modulation system according to the present invention further comprising c) a third gene expression cassette comprising: i) a response element to which the DNA-binding domain of the first hybrid polypeptide binds; ii) a promoter that is activated by the transactivation domain of the second hybrid polypeptide; and iii) a gene whose expression is to be modulated.
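
The composition of the system set out in the three preceding paragraphs may be summarized as a small data model. The following is a schematic sketch only: the class names, field names, and the string labels `"DBD"`, `"AD"`, `"EcR"`, and `"chimeric RXR"` are hypothetical conveniences for illustration, not terms of art, and the check captures only the pairing rules stated above (either ligand binding domain may accompany the DNA-binding domain, provided the other accompanies the transactivation domain).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HybridPolypeptide:
    functional_domain: str            # "DBD" or "AD" (transactivation domain)
    ligand_binding_domain: str        # "EcR" or "chimeric RXR"
    recognizes: Optional[str] = None  # response element bound by a DBD


@dataclass(frozen=True)
class ThirdCassette:
    response_element: str
    promoter: str
    gene: str


def is_valid_system(first, second, third=None):
    """Check the domain pairing of the first and second hybrid polypeptides.

    The first polypeptide must carry the DNA-binding domain, the second
    the transactivation domain, and between them they must supply one EcR
    and one chimeric RXR ligand binding domain. When a third cassette is
    given, its response element must be the one the DBD recognizes.
    """
    if first.functional_domain != "DBD" or second.functional_domain != "AD":
        return False
    lbds = {first.ligand_binding_domain, second.ligand_binding_domain}
    if lbds != {"EcR", "chimeric RXR"}:
        return False
    if third is not None and third.response_element != first.recognizes:
        return False
    return True
```

Both arrangements described above pass the check: a DBD fused to the EcR ligand binding domain paired with a transactivation domain fused to the chimeric RXR ligand binding domain, and the reverse. A pairing that supplies two EcR ligand binding domains does not.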

[0143] In a specific embodiment, the gene whose expression is to be modulated is a homologous gene with respect to the host cell. In another specific embodiment, the gene whose expression is to be modulated is a heterologous gene with respect to the host cell.

[0144] The ligands for use in the present invention as described below, when combined with an EcR ligand binding domain and a chimeric RXR ligand binding domain, which in turn are bound to the response element linked to a gene, provide the means for external temporal regulation of expression of the gene. The binding mechanism or the order in which the various components of this invention bind to each other, that is, for example, ligand to receptor, first hybrid polypeptide to response element, second hybrid polypeptide to promoter, etc., is not critical. Binding of the ligand to the EcR ligand binding domain and the chimeric RXR ligand binding domain enables expression or suppression of the gene. This mechanism does not exclude the potential for ligand binding to EcR or chimeric RXR, and the resulting formation of active homodimer complexes (e.g. EcR+EcR or chimeric RXR+chimeric RXR). Preferably, one or more of the receptor domains is varied producing a hybrid gene switch. Typically, one or more of the three domains, DBD, LBD, and transactivation domain, may be chosen from a source different than the source of the other domains so that the hybrid genes and the resulting hybrid proteins are optimized in the chosen host cell or organism for transactivating activity, complementary binding of the ligand, and recognition of a specific response element. In addition, the response element itself can be modified or substituted with response elements for other DNA binding protein domains such as the GAL-4 protein from yeast (see Sadowski, et al. (1988), Nature 335: 563-564) or LexA protein from Escherichia coli (see Brent and Ptashne (1985), Cell 43: 729-736), or synthetic response elements specific for targeted interactions with proteins designed, modified, and selected for such specific interactions (see, for example, Kim, et al. (1997), Proc. Natl. Acad. Sci., USA 94: 3616-3620) to accommodate hybrid receptors. Another advantage of two-hybrid systems is that they allow choice of a promoter used to drive the gene expression according to a desired end result. Such double control can be particularly important in areas of gene therapy, especially when cytotoxic proteins are produced, because both the timing of expression as well as the cells wherein expression occurs can be controlled. When genes, operably linked to a suitable promoter, are introduced into the cells of the subject, expression of the exogenous genes is controlled by the presence of the system of this invention. Promoters may be constitutively or inducibly regulated or may be tissue-specific (that is, expressed only in a particular type of cells) or specific to certain developmental stages of the organism.

GENE EXPRESSION CASSETTES OF THE INVENTION

[0145] The novel EcR/chimeric RXR-based inducible gene expression system of the invention comprises gene expression cassettes that are capable of being expressed in a host cell, wherein the gene expression cassettes each comprise a polynucleotide encoding a hybrid polypeptide. Thus, Applicants' invention also provides gene expression cassettes for use in the gene expression system of the invention.

[0146] Specifically, the present invention provides a gene expression cassette comprising a polynucleotide encoding a hybrid polypeptide. In particular, the present invention provides a gene expression cassette that is capable of being expressed in a host cell, wherein the gene expression cassette comprises a polynucleotide that encodes a hybrid polypeptide comprising either i) a DNA-binding domain that recognizes a response element, or ii) a transactivation domain; and an ecdysone receptor ligand binding domain or a chimeric retinoid X receptor ligand binding domain.

[0147] In a specific embodiment, the gene expression cassette encodes a hybrid polypeptide comprising a DNA-binding domain that recognizes a response element and an EcR ligand binding domain.

[0148] In another specific embodiment, the gene expression cassette encodes a hybrid polypeptide comprising a DNA-binding domain that recognizes a response element and a chimeric RXR ligand binding domain.

[0149] In another specific embodiment, the gene expression cassette encodes a hybrid polypeptide comprising a transactivation domain and an EcR ligand binding domain.

[0150] In another specific embodiment, the gene expression cassette encodes a hybrid polypeptide comprising a transactivation domain and a chimeric RXR ligand binding domain.
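The four embodiments of paragraphs [0147] to [0150] are the pairings of one functional domain with one ligand binding domain. The following Python sketch is offered only as an illustration of that enumeration; the constant and function names are hypothetical labels and are not part of the invention.

```python
from itertools import product

# Illustrative labels only (hypothetical names, not taken from the specification).
FUNCTIONAL_DOMAINS = ("DNA-binding domain", "transactivation domain")
LIGAND_BINDING_DOMAINS = ("EcR LBD", "chimeric RXR LBD")


def hybrid_polypeptide_designs():
    """Return every (functional domain, ligand binding domain) pairing."""
    return list(product(FUNCTIONAL_DOMAINS, LIGAND_BINDING_DOMAINS))


if __name__ == "__main__":
    for functional, lbd in hybrid_polypeptide_designs():
        print(f"{functional} + {lbd}")
```

Each pairing corresponds to one of the hybrid polypeptides that a gene expression cassette may encode.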

[0151] In a preferred embodiment, the ligand binding domain (LBD) is an EcR LBD, a chimeric RXR LBD, or a related steroid/thyroid hormone nuclear receptor family member LBD or chimeric LBD, analog, combination, or modification thereof. In a specific embodiment, the LBD is an EcR LBD or a chimeric RXR LBD. In another specific embodiment, the LBD is from a truncated EcR LBD or a truncated chimeric RXR LBD. A truncation mutation may be made by any method used in the art, including but not limited to restriction endonuclease digestion/deletion, PCR-mediated/oligonucleotide-directed deletion, chemical mutagenesis, DNA strand breakage, and the like.

[0152] The EcR may be an invertebrate EcR, preferably selected from the class Arthropod. Preferably, the EcR is selected from the group consisting of a Lepidopteran EcR, a Dipteran EcR, an Orthopteran EcR, a Homopteran EcR and a Hemipteran EcR. More preferably, the EcR for use is a spruce budworm Choristoneura fumiferana EcR (“CfEcR”), a beetle Tenebrio molitor EcR (“TmEcR”), a Manduca sexta EcR (“MsEcR”), a Heliothis virescens EcR (“HvEcR”), a midge Chironomus tentans EcR (“CtEcR”), a silk moth Bombyx mori EcR (“BmEcR”), a fruit fly Drosophila melanogaster EcR (“DmEcR”), a mosquito Aedes aegypti EcR (“AaEcR”), a blowfly Lucilia capitata EcR (“LcEcR”), a blowfly Lucilia cuprina EcR (“LucEcR”), a Mediterranean fruit fly Ceratitis capitata EcR (“CcEcR”), a locust Locusta migratoria EcR (“LmEcR”), an aphid Myzus persicae EcR (“MpEcR”), a fiddler crab Celuca pugilator EcR (“CpEcR”), an ixodid tick Amblyomma americanum EcR (“AmaEcR”), a whitefly Bemisia argentifolii EcR (“BaEcR”, SEQ ID NO: 68) or a leafhopper Nephotettix cincticeps EcR (“NcEcR”, SEQ ID NO: 69). In a specific embodiment, the LBD is from spruce budworm (Choristoneura fumiferana) EcR (“CfEcR”) or fruit fly Drosophila melanogaster EcR (“DmEcR”).

[0153] In a specific embodiment, the EcR LBD comprises full-length EF domains. In a preferred embodiment, the full-length EF domains are encoded by a polynucleotide comprising a nucleic acid sequence of SEQ ID NO: 1 or SEQ ID NO: 2.

[0154] In a specific embodiment, the LBD is from a truncated EcR LBD. The EcR LBD truncation results in a deletion of at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, or 240 amino acids. In another specific embodiment, the EcR LBD truncation results in a deletion of at least a partial polypeptide domain. In another specific embodiment, the EcR LBD truncation results in a deletion of at least an entire polypeptide domain. More preferably, the EcR polypeptide truncation results in a deletion of at least an A/B-domain, a C-domain, a D-domain, an F-domain, an A/B/C-domains, an A/B/1/2-C-domains, an A/B/C/D-domains, an A/B/C/D/F-domains, an A/B/F-domains, an A/B/C/F-domains, a partial E-domain, or a partial F-domain. A combination of several partial and/or complete domain deletions may also be performed.

[0155] In one embodiment, the ecdysone receptor ligand binding domain is encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of SEQ ID NO: 1 (CfEcR-EF), SEQ ID NO: 2 (DmEcR-EF), SEQ ID NO: 3 (CfEcR-DE), and SEQ ID NO: 4 (DmEcR-DE).

[0156] In a preferred embodiment, the ecdysone receptor ligand binding domain is encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of SEQ ID NO: 65 (CfEcR-DEF), SEQ ID NO: 59 (CfEcR-CDEF), SEQ ID NO: 67 (DmEcR-DEF), SEQ ID NO: 71 (TmEcR-DEF) and SEQ ID NO: 73 (AmaEcR-DEF).

[0157] In one embodiment, the ecdysone receptor ligand binding domain comprises an amino acid sequence selected from the group consisting of SEQ ID NO: 5 (CfEcR-EF), SEQ ID NO: 6 (DmEcR-EF), SEQ ID NO: 7 (CfEcR-DE), and SEQ ID NO: 8 (DmEcR-DE).

[0158] In a preferred embodiment, the ecdysone receptor ligand binding domain comprises an amino acid sequence selected from the group consisting of SEQ ID NO: 57 (CfEcR-DEF), SEQ ID NO: 70 (CfEcR-CDEF), SEQ ID NO: 58 (DmEcR-DEF), SEQ ID NO: 72 (TmEcR-DEF), and SEQ ID NO: 74 (AmaEcR-DEF).

[0159] Preferably, the chimeric RXR ligand binding domain comprises at least two polypeptide fragments selected from the group consisting of a vertebrate species RXR polypeptide fragment, an invertebrate species RXR polypeptide fragment, and a non-Dipteran/non-Lepidopteran invertebrate species RXR homolog polypeptide fragment. A chimeric RXR ligand binding domain according to the invention may comprise at least two different species RXR polypeptide fragments, or when the species is the same, the two or more polypeptide fragments may be from two or more different isoforms of the species RXR polypeptide fragment.

[0160] In a specific embodiment, the vertebrate species RXR polypeptide fragment is from a mouse Mus musculus RXR (“MmRXR”) or a human Homo sapiens RXR (“HsRXR”). The RXR polypeptide may be an RXRα, RXRβ, or RXRγ isoform.

[0161] In a preferred embodiment, the vertebrate species RXR polypeptide fragment is from a vertebrate species RXR-EF domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, and SEQ ID NO: 14. In another preferred embodiment, the vertebrate species RXR polypeptide fragment is from a vertebrate species RXR-EF domain comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, and SEQ ID NO: 20.

[0162] In another specific embodiment, the invertebrate species RXR polypeptide fragment is from a locust Locusta migratoria ultraspiracle polypeptide (“LmUSP”), an ixodid tick Amblyomma americanum RXR homolog 1 (“AmaRXR1”), an ixodid tick Amblyomma americanum RXR homolog 2 (“AmaRXR2”), a fiddler crab Celuca pugilator RXR homolog (“CpRXR”), a beetle Tenebrio molitor RXR homolog (“TmRXR”), a honeybee Apis mellifera RXR homolog (“AmRXR”), or an aphid Myzus persicae RXR homolog (“MpRXR”).

[0163] In a preferred embodiment, the invertebrate species RXR polypeptide fragment is from an invertebrate species RXR-EF domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, and SEQ ID NO: 26. In another preferred embodiment, the invertebrate species RXR polypeptide fragment is from an invertebrate species RXR-EF domain comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, and SEQ ID NO: 32.

[0164] In another specific embodiment, the invertebrate species RXR polypeptide fragment is from a non-Dipteran/non-Lepidopteran invertebrate species RXR homolog.

[0165] In a preferred embodiment, the chimeric RXR ligand binding domain comprises at least one vertebrate species RXR polypeptide fragment and one invertebrate species RXR polypeptide fragment.

[0166] In another preferred embodiment, the chimeric RXR ligand binding domain comprises at least one vertebrate species RXR polypeptide fragment and one non-Dipteran/non-Lepidopteran invertebrate species RXR homolog polypeptide fragment.

[0167] In another preferred embodiment, the chimeric RXR ligand binding domain comprises at least one invertebrate species RXR polypeptide fragment and one non-Dipteran/non-Lepidopteran invertebrate species RXR homolog polypeptide fragment.

[0168] In another preferred embodiment, the chimeric RXR ligand binding domain comprises at least one vertebrate species RXR polypeptide fragment and one different vertebrate species RXR polypeptide fragment.

[0169] In another preferred embodiment, the chimeric RXR ligand binding domain comprises at least one invertebrate species RXR polypeptide fragment and one different invertebrate species RXR polypeptide fragment.

[0170] In another preferred embodiment, the chimeric RXR ligand binding domain comprises at least one non-Dipteran/non-Lepidopteran invertebrate species RXR polypeptide fragment and one different non-Dipteran/non-Lepidopteran invertebrate species RXR polypeptide fragment.
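The six preferred embodiments of paragraphs [0165] to [0170] cover every unordered pairing of the three fragment source classes, including pairings within a single class, where the two fragments come from different species of that class. The following Python sketch is offered only to make that enumeration explicit; the labels and the function name are hypothetical.

```python
from itertools import combinations_with_replacement

# Illustrative labels for the three fragment source classes (hypothetical names).
FRAGMENT_SOURCES = (
    "vertebrate RXR",
    "invertebrate RXR",
    "non-Dipteran/non-Lepidopteran invertebrate RXR homolog",
)


def fragment_source_pairings():
    """Return each unordered pair of source classes, same-class pairs included.

    A same-class pair stands for two fragments from different species
    within that class.
    """
    return list(combinations_with_replacement(FRAGMENT_SOURCES, 2))


if __name__ == "__main__":
    for first, second in fragment_source_pairings():
        note = " (different species)" if first == second else ""
        print(f"{first} + {second}{note}")
```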

[0171] In a specific embodiment, the chimeric RXR LBD comprises an RXR LBD domain comprising at least one polypeptide fragment selected from the group consisting of an EF-domain helix 1, an EF-domain helix 2, an EF-domain helix 3, an EF-domain helix 4, an EF-domain helix 5, an EF-domain helix 6, an EF-domain helix 7, an EF-domain helix 8, an EF-domain helix 9, an EF-domain helix 10, an EF-domain helix 11, an EF-domain helix 12, an F-domain, and an EF-domain β-pleated sheet, wherein the polypeptide fragment is from a different species RXR, i.e., chimeric to the RXR LBD domain, than the RXR LBD domain.

[0172] In another specific embodiment, the first polypeptide fragment of the chimeric RXR ligand binding domain comprises helices 1-6, helices 1-7, helices 1-8, helices 1-9, helices 1-10, helices 1-11, or helices 1-12 of a first species RXR according to the invention, and the second polypeptide fragment of the chimeric RXR ligand binding domain comprises helices 7-12, helices 8-12, helices 9-12, helices 10-12, helices 11-12, helix 12, or F domain of a second species RXR according to the invention, respectively.

[0173] In a preferred embodiment, the first polypeptide fragment of the chimeric RXR ligand binding domain comprises helices 1-6 of a first species RXR according to the invention, and the second polypeptide fragment of the chimeric RXR ligand binding domain comprises helices 7-12 of a second species RXR according to the invention.

[0174] In another preferred embodiment, the first polypeptide fragment of the chimeric RXR ligand binding domain comprises helices 1-7 of a first species RXR according to the invention, and the second polypeptide fragment of the chimeric RXR ligand binding domain comprises helices 8-12 of a second species RXR according to the invention.

[0175] In another preferred embodiment, the first polypeptide fragment of the chimeric RXR ligand binding domain comprises helices 1-8 of a first species RXR according to the invention, and the second polypeptide fragment of the chimeric RXR ligand binding domain comprises helices 9-12 of a second species RXR according to the invention.

[0176] In another preferred embodiment, the first polypeptide fragment of the chimeric RXR ligand binding domain comprises helices 1-9 of a first species RXR according to the invention, and the second polypeptide fragment of the chimeric RXR ligand binding domain comprises helices 10-12 of a second species RXR according to the invention.

[0177] In another preferred embodiment, the first polypeptide fragment of the chimeric RXR ligand binding domain comprises helices 1-10 of a first species RXR according to the invention, and the second polypeptide fragment of the chimeric RXR ligand binding domain comprises helices 11-12 of a second species RXR according to the invention.

[0178] In another preferred embodiment, the first polypeptide fragment of the chimeric RXR ligand binding domain comprises helices 1-11 of a first species RXR according to the invention, and the second polypeptide fragment of the chimeric RXR ligand binding domain comprises helix 12 of a second species RXR according to the invention.

[0179] In another preferred embodiment, the first polypeptide fragment of the chimeric RXR ligand binding domain comprises helices 1-12 of a first species RXR according to the invention, and the second polypeptide fragment of the chimeric RXR ligand binding domain comprises an F domain of a second species RXR according to the invention.
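The junction pairings of paragraphs [0172] to [0179] follow a single rule: the first fragment runs from helix 1 through helix N of the first species RXR, and the second fragment supplies the remaining helices of the second species RXR, or the F domain when N is 12. The following Python sketch illustrates that rule only; the function names and the default junction range are assumptions made for the illustration.

```python
LAST_HELIX = 12  # the EF-domain helices are numbered 1 to 12 in the text


def describe_fragment(start, end):
    """Format a helix span the way the embodiments word it."""
    return f"helix {start}" if start == end else f"helices {start}-{end}"


def chimera_junctions(first_junction=6, last_helix=LAST_HELIX):
    """Pair each first-species fragment with its second-species complement."""
    pairs = []
    for n in range(first_junction, last_helix + 1):
        first = describe_fragment(1, n)
        if n < last_helix:
            second = describe_fragment(n + 1, last_helix)
        else:
            # All twelve helices come from the first species, so the
            # second species contributes only the F domain.
            second = "F domain"
        pairs.append((first, second))
    return pairs


if __name__ == "__main__":
    for first, second in chimera_junctions():
        print(f"first species RXR: {first}; second species RXR: {second}")
```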

[0180] In another specific embodiment, the LBD is from a truncated chimeric RXR ligand binding domain. The chimeric RXR LBD truncation results in a deletion of at least 1, 2, 3, 4, 5, 6, 8, 10, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 25, 26, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, or 240 amino acids. Preferably, the chimeric RXR LBD truncation results in a deletion of at least a partial polypeptide domain. More preferably, the chimeric RXR LBD truncation results in a deletion of at least an entire polypeptide domain. In a preferred embodiment, the chimeric RXR LBD truncation results in a deletion of at least a partial E-domain, a complete E-domain, a partial F-domain, a complete F-domain, an EF-domain helix 1, an EF-domain helix 2, an EF-domain helix 3, an EF-domain helix 4, an EF-domain helix 5, an EF-domain helix 6, an EF-domain helix 7, an EF-domain helix 8, an EF-domain helix 9, an EF-domain helix 10, an EF-domain helix 11, an EF-domain helix 12, or an EF-domain β-pleated sheet. A combination of several partial and/or complete domain deletions may also be performed.

[0181] In a preferred embodiment, the truncated chimeric RXR ligand binding domain is encoded by a polynucleotide comprising a nucleic acid sequence of SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, or SEQ ID NO: 38. In another preferred embodiment, the truncated chimeric RXR ligand binding domain comprises an amino acid sequence of SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, or SEQ ID NO: 44.

[0182] In a preferred embodiment, the chimeric RXR ligand binding domain is encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of a) SEQ ID NO: 45, b) nucleotides 1-348 of SEQ ID NO: 13 and nucleotides 268-630 of SEQ ID NO: 21, c) nucleotides 1-408 of SEQ ID NO: 13 and nucleotides 337-630 of SEQ ID NO: 21, d) nucleotides 1-465 of SEQ ID NO: 13 and nucleotides 403-630 of SEQ ID NO: 21, e) nucleotides 1-555 of SEQ ID NO: 13 and nucleotides 490-630 of SEQ ID NO: 21, f) nucleotides 1-624 of SEQ ID NO: 13 and nucleotides 547-630 of SEQ ID NO: 21, g) nucleotides 1-645 of SEQ ID NO: 13 and nucleotides 601-630 of SEQ ID NO: 21, and h) nucleotides 1-717 of SEQ ID NO: 13 and nucleotides 613-630 of SEQ ID NO: 21.

[0183] In another preferred embodiment, the chimeric RXR ligand binding domain comprises an amino acid sequence selected from the group consisting of a) SEQ ID NO: 46, b) amino acids 1-116 of SEQ ID NO: 13 and amino acids 90-210 of SEQ ID NO: 21, c) amino acids 1-136 of SEQ ID NO: 13 and amino acids 113-210 of SEQ ID NO: 21, d) amino acids 1-155 of SEQ ID NO: 13 and amino acids 135-210 of SEQ ID NO: 21, e) amino acids 1-185 of SEQ ID NO: 13 and amino acids 164-210 of SEQ ID NO: 21, f) amino acids 1-208 of SEQ ID NO: 13 and amino acids 183-210 of SEQ ID NO: 21, g) amino acids 1-215 of SEQ ID NO: 13 and amino acids 201-210 of SEQ ID NO: 21, and h) amino acids 1-239 of SEQ ID NO: 13 and amino acids 205-210 of SEQ ID NO: 21.
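The nucleotide ranges of paragraph [0182] and the amino acid ranges of paragraph [0183] can be cross-checked by codon arithmetic. The following Python sketch assumes, as an illustration only, that each coding sequence is read in frame from nucleotide 1 with 1-based, inclusive coordinates; the function and variable names are hypothetical.

```python
def nt_range_to_aa_range(nt_start, nt_end):
    """Map a 1-based, inclusive nucleotide range to amino acid positions.

    Assumes the reading frame begins at nucleotide 1 (an assumption made
    for this sketch, not a statement taken from the sequence listing).
    """
    if (nt_start - 1) % 3 != 0 or nt_end % 3 != 0:
        raise ValueError("range does not start and end on codon boundaries")
    return ((nt_start - 1) // 3 + 1, nt_end // 3)


# Items b) to h) of [0182]: (first-fragment range, second-fragment range).
NUCLEOTIDE_PAIRS = [
    ((1, 348), (268, 630)),
    ((1, 408), (337, 630)),
    ((1, 465), (403, 630)),
    ((1, 555), (490, 630)),
    ((1, 624), (547, 630)),
    ((1, 645), (601, 630)),
    ((1, 717), (613, 630)),
]

# Items b) to h) of [0183], in the same order.
AMINO_ACID_PAIRS = [
    ((1, 116), (90, 210)),
    ((1, 136), (113, 210)),
    ((1, 155), (135, 210)),
    ((1, 185), (164, 210)),
    ((1, 208), (183, 210)),
    ((1, 215), (201, 210)),
    ((1, 239), (205, 210)),
]


def converted_pairs():
    """Convert every nucleotide pair to its amino acid pair."""
    return [
        (nt_range_to_aa_range(*first), nt_range_to_aa_range(*second))
        for first, second in NUCLEOTIDE_PAIRS
    ]


if __name__ == "__main__":
    print(converted_pairs() == AMINO_ACID_PAIRS)
```

Under the stated frame assumption, each nucleotide range converts to the amino acid range recited for the same lettered item.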

[0184] For purposes of this invention, EcR, vertebrate RXR, invertebrate RXR, and chimeric RXR also include synthetic and hybrid EcR, vertebrate RXR, invertebrate RXR, and chimeric RXR, and their homologs.

[0185] The DNA binding domain can be any DNA binding domain with a known response element, including synthetic and chimeric DNA binding domains, or analogs, combinations, or modifications thereof. Preferably, the DBD is a GAL4 DBD, a LexA DBD, a transcription factor DBD, a steroid/thyroid hormone nuclear receptor superfamily member DBD, a bacterial LacZ DBD, or a yeast put DBD. More preferably, the DBD is a GAL4 DBD [SEQ ID NO: 47 (polynucleotide) or SEQ ID NO: 48 (polypeptide)] or a LexA DBD [SEQ ID NO: 49 (polynucleotide) or SEQ ID NO: 50 (polypeptide)].

[0186] The transactivation domain (abbreviated “AD” or “TA”) may be any steroid/thyroid hormone nuclear receptor AD, synthetic or chimeric AD, polyglutamine AD, basic or acidic amino acid AD, a VP16 AD, a GAL4 AD, an NF-κB AD, a BP64 AD, a B42 acidic activation domain (B42AD), or an analog, combination, or modification thereof. In a specific embodiment, the AD is a synthetic or chimeric AD, or is obtained from a VP16, GAL4, NF-κB, or B42 acidic activation domain AD. Preferably, the AD is a VP16 AD [SEQ ID NO: 51 (polynucleotide) or SEQ ID NO: 52 (polypeptide)] or a B42 AD [SEQ ID NO: 53 (polynucleotide) or SEQ ID NO: 54 (polypeptide)].

[0187] In a preferred embodiment, the gene expression cassette encodes a hybrid polypeptide comprising a DNA-binding domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of a GAL4 DBD (SEQ ID NO: 47) and a LexA DBD (SEQ ID NO: 49), and an EcR ligand binding domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of SEQ ID NO: 65 (CfEcR-DEF), SEQ ID NO: 59 (CfEcR-CDEF), SEQ ID NO: 67 (DmEcR-DEF), SEQ ID NO: 71 (TmEcR-DEF) and SEQ ID NO: 73 (AmaEcR-DEF).

[0188] In another preferred embodiment, the gene expression cassette encodes a hybrid polypeptide comprising a DNA-binding domain comprising an amino acid sequence selected from the group consisting of a GAL4 DBD (SEQ ID NO: 48) and a LexA DBD (SEQ ID NO: 50), and an EcR ligand binding domain comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 57 (CfEcR-DEF), SEQ ID NO: 70 (CfEcR-CDEF), SEQ ID NO: 58 (DmEcR-DEF), SEQ ID NO: 72 (TmEcR-DEF), and SEQ ID NO: 74 (AmaEcR-DEF).

[0189] In another preferred embodiment, the gene expression cassette encodes a hybrid polypeptide comprising a DNA-binding domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of a GAL4 DBD (SEQ ID NO: 47) and a LexA DBD (SEQ ID NO: 49), and a chimeric RXR ligand binding domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of a) SEQ ID NO: 45, b) nucleotides 1-348 of SEQ ID NO: 13 and nucleotides 268-630 of SEQ ID NO: 21, c) nucleotides 1-408 of SEQ ID NO: 13 and nucleotides 337-630 of SEQ ID NO: 21, d) nucleotides 1-465 of SEQ ID NO: 13 and nucleotides 403-630 of SEQ ID NO: 21, e) nucleotides 1-555 of SEQ ID NO: 13 and nucleotides 490-630 of SEQ ID NO: 21, f) nucleotides 1-624 of SEQ ID NO: 13 and nucleotides 547-630 of SEQ ID NO: 21, g) nucleotides 1-645 of SEQ ID NO: 13 and nucleotides 601-630 of SEQ ID NO: 21, and h) nucleotides 1-717 of SEQ ID NO: 13 and nucleotides 613-630 of SEQ ID NO: 21.

[0190] In another preferred embodiment, the gene expression cassette encodes a hybrid polypeptide comprising a DNA-binding domain comprising an amino acid sequence selected from the group consisting of a GAL4 DBD (SEQ ID NO: 48) and a LexA DBD (SEQ ID NO: 50), and a chimeric RXR ligand binding domain comprising an amino acid sequence selected from the group consisting of a) SEQ ID NO: 46, b) amino acids 1-116 of SEQ ID NO: 13 and amino acids 90-210 of SEQ ID NO: 21, c) amino acids 1-136 of SEQ ID NO: 13 and amino acids 113-210 of SEQ ID NO: 21, d) amino acids 1-155 of SEQ ID NO: 13 and amino acids 135-210 of SEQ ID NO: 21, e) amino acids 1-185 of SEQ ID NO: 13 and amino acids 164-210 of SEQ ID NO: 21, f) amino acids 1-208 of SEQ ID NO: 13 and amino acids 183-210 of SEQ ID NO: 21, g) amino acids 1-215 of SEQ ID NO: 13 and amino acids 201-210 of SEQ ID NO: 21, and h) amino acids 1-239 of SEQ ID NO: 13 and amino acids 205-210 of SEQ ID NO: 21.

[0191] In another preferred embodiment, the gene expression cassette encodes a hybrid polypeptide comprising a transactivation domain encoded by a polynucleotide comprising a nucleic acid sequence of SEQ ID NO: 51 or SEQ ID NO: 53, and an EcR ligand binding domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of SEQ ID NO: 65 (CfEcR-DEF), SEQ ID NO: 59 (CfEcR-CDEF), SEQ ID NO: 67 (DmEcR-DEF), SEQ ID NO: 71 (TmEcR-DEF) and SEQ ID NO: 73 (AmaEcR-DEF).

[0192] In another preferred embodiment, the gene expression cassette encodes a hybrid polypeptide comprising a transactivation domain comprising an amino acid sequence of SEQ ID NO: 52 or SEQ ID NO: 54, and an EcR ligand binding domain comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 57 (CfEcR-DEF), SEQ ID NO: 70 (CfEcR-CDEF), SEQ ID NO: 58 (DmEcR-DEF), SEQ ID NO: 72 (TmEcR-DEF), and SEQ ID NO: 74 (AmaEcR-DEF).

[0193] In another preferred embodiment, the gene expression cassette encodes a hybrid polypeptide comprising a transactivation domain encoded by a polynucleotide comprising a nucleic acid sequence of SEQ ID NO: 51 or SEQ ID NO: 53 and a chimeric RXR ligand binding domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of a) SEQ ID NO: 45, b) nucleotides 1-348 of SEQ ID NO: 13 and nucleotides 268-630 of SEQ ID NO: 21, c) nucleotides 1-408 of SEQ ID NO: 13 and nucleotides 337-630 of SEQ ID NO: 21, d) nucleotides 1-465 of SEQ ID NO: 13 and nucleotides 403-630 of SEQ ID NO: 21, e) nucleotides 1-555 of SEQ ID NO: 13 and nucleotides 490-630 of SEQ ID NO: 21, f) nucleotides 1-624 of SEQ ID NO: 13 and nucleotides 547-630 of SEQ ID NO: 21, g) nucleotides 1-645 of SEQ ID NO: 13 and nucleotides 601-630 of SEQ ID NO: 21, and h) nucleotides 1-717 of SEQ ID NO: 13 and nucleotides 613-630 of SEQ ID NO: 21.

[0194] In another preferred embodiment, the gene expression cassette encodes a hybrid polypeptide comprising a transactivation domain comprising an amino acid sequence of SEQ ID NO: 52 or SEQ ID NO: 54 and a chimeric RXR ligand binding domain comprising an amino acid sequence selected from the group consisting of a) SEQ ID NO: 46, b) amino acids 1-116 of SEQ ID NO: 13 and amino acids 90-210 of SEQ ID NO: 21, c) amino acids 1-136 of SEQ ID NO: 13 and amino acids 113-210 of SEQ ID NO: 21, d) amino acids 1-155 of SEQ ID NO: 13 and amino acids 135-210 of SEQ ID NO: 21, e) amino acids 1-185 of SEQ ID NO: 13 and amino acids 164-210 of SEQ ID NO: 21, f) amino acids 1-208 of SEQ ID NO: 13 and amino acids 183-210 of SEQ ID NO: 21, g) amino acids 1-215 of SEQ ID NO: 13 and amino acids 201-210 of SEQ ID NO: 21, and h) amino acids 1-239 of SEQ ID NO: 13 and amino acids 205-210 of SEQ ID NO: 21.

[0195] The response element (“RE”) may be any response element with a known DNA binding domain, or an analog, combination, or modification thereof. A single RE may be employed or multiple REs, either multiple copies of the same RE or two or more different REs, may be used in the present invention. In a specific embodiment, the RE is an RE from GAL4 (“GAL4RE”), LexA, a steroid/thyroid hormone nuclear receptor RE, or a synthetic RE that recognizes a synthetic DNA binding domain. Preferably, the RE is a GAL4RE comprising a polynucleotide sequence of SEQ ID NO: 55 or a LexARE (operon “op”) comprising a polynucleotide sequence of SEQ ID NO: 56 (2XLexAop). Preferably, the first hybrid protein is substantially free of a transactivation domain and the second hybrid protein is substantially free of a DNA binding domain. For purposes of this invention, “substantially free” means that the protein in question does not contain a sufficient sequence of the domain in question to provide activation or binding activity.

[0196] Thus, the present invention also relates to a gene expression cassette comprising: i) a response element comprising a domain to which a polypeptide comprising a DNA binding domain binds; ii) a promoter that is activated by a polypeptide comprising a transactivation domain; and iii) a gene whose expression is to be modulated.

[0197] Genes of interest for use in Applicants' gene expression cassettes may be endogenous genes or heterologous genes. Nucleic acid or amino acid sequence information for a desired gene or protein can be located in one of many public access databases, for example, GENBANK, EMBL, Swiss-Prot, and PIR, or in many biology related journal publications. Thus, those skilled in the art have access to nucleic acid sequence information for virtually all known genes. Such information can then be used to construct the desired constructs for the insertion of the gene of interest within the gene expression cassettes used in Applicants' methods described herein.

[0198] Examples of genes of interest for use in Applicants' gene expression cassettes include, but are not limited to: genes encoding therapeutically desirable polypeptides or products that may be used to treat a condition, a disease, a disorder, a dysfunction, a genetic defect, such as monoclonal antibodies, enzymes, proteases, cytokines, interferons, insulin, erythropoietin, clotting factors, other blood factors or components, viral vectors for gene therapy, virus for vaccines, targets for drug discovery, functional genomics, and proteomics analyses and applications, and the like.

POLYNUCLEOTIDES OF THE INVENTION

[0199] The novel ecdysone receptor/chimeric retinoid X receptor-based inducible gene expression system of the invention comprises a gene expression cassette comprising a polynucleotide that encodes a hybrid polypeptide comprising a) a DNA binding domain or a transactivation domain, and b) an EcR ligand binding domain or a chimeric RXR ligand binding domain. These gene expression cassettes, the polynucleotides they comprise, and the hybrid polypeptides they encode are useful as components of an EcR-based gene expression system to modulate the expression of a gene within a host cell.

[0200] Thus, the present invention provides an isolated polynucleotide that encodes a hybrid polypeptide comprising a) a DNA binding domain or a transactivation domain according to the invention, and b) an EcR ligand binding domain or a chimeric RXR ligand binding domain according to the invention.

[0201] The present invention also relates to an isolated polynucleotide that encodes a chimeric RXR ligand binding domain according to the invention.

[0202] The present invention also relates to an isolated polynucleotide that encodes a truncated EcR LBD or a truncated chimeric RXR LBD comprising a truncation mutation according to the invention. Specifically, the present invention relates to an isolated polynucleotide encoding a truncated EcR or chimeric RXR ligand binding domain comprising a truncation mutation that affects ligand binding activity or ligand sensitivity that is useful in modulating gene expression in a host cell.

[0203] In a specific embodiment, the isolated polynucleotide encoding an EcR LBD comprises a polynucleotide sequence selected from the group consisting of SEQ ID NO: 65 (CfEcR-DEF), SEQ ID NO: 59 (CfEcR-CDEF), SEQ ID NO: 67 (DmEcR-DEF), SEQ ID NO: 71 (TmEcR-DEF) and SEQ ID NO: 73 (AmaEcR-DEF).

[0204] In another specific embodiment, the isolated polynucleotide encodes an EcR LBD comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 57 (CfEcR-DEF), SEQ ID NO: 70 (CfEcR-CDEF), SEQ ID NO: 58 (DmEcR-DEF), SEQ ID NO: 72 (TmEcR-DEF), and SEQ ID NO: 74 (AmaEcR-DEF).

[0205] In another specific embodiment, the isolated polynucleotide encoding a chimeric RXR LBD comprises a polynucleotide sequence selected from the group consisting of a) SEQ ID NO: 45, b) nucleotides 1-348 of SEQ ID NO: 13 and nucleotides 268-630 of SEQ ID NO: 21, c) nucleotides 1-408 of SEQ ID NO: 13 and nucleotides 337-630 of SEQ ID NO: 21, d) nucleotides 1-465 of SEQ ID NO: 13 and nucleotides 403-630 of SEQ ID NO: 21, e) nucleotides 1-555 of SEQ ID NO: 13 and nucleotides 490-630 of SEQ ID NO: 21, f) nucleotides 1-624 of SEQ ID NO: 13 and nucleotides 547-630 of SEQ ID NO: 21, g) nucleotides 1-645 of SEQ ID NO: 13 and nucleotides 601-630 of SEQ ID NO: 21, and h) nucleotides 1-717 of SEQ ID NO: 13 and nucleotides 613-630 of SEQ ID NO: 21.

[0206] In another specific embodiment, the isolated polynucleotide encodes a chimeric RXR LBD comprising an amino acid sequence selected from the group consisting of a) SEQ ID NO: 46, b) amino acids 1-116 of SEQ ID NO: 13 and amino acids 90-210 of SEQ ID NO: 21, c) amino acids 1-136 of SEQ ID NO: 13 and amino acids 113-210 of SEQ ID NO: 21, d) amino acids 1-155 of SEQ ID NO: 13 and amino acids 135-210 of SEQ ID NO: 21, e) amino acids 1-185 of SEQ ID NO: 13 and amino acids 164-210 of SEQ ID NO: 21, f) amino acids 1-208 of SEQ ID NO: 13 and amino acids 183-210 of SEQ ID NO: 21, g) amino acids 1-215 of SEQ ID NO: 13 and amino acids 201-210 of SEQ ID NO: 21, and h) amino acids 1-239 of SEQ ID NO: 13 and amino acids 205-210 of SEQ ID NO: 21.

[0207] In particular, the present invention relates to an isolated polynucleotide encoding a truncated chimeric RXR LBD comprising a truncation mutation, wherein the mutation reduces ligand binding activity or ligand sensitivity of the truncated chimeric RXR LBD. In a specific embodiment, the present invention relates to an isolated polynucleotide encoding a truncated chimeric RXR LBD comprising a truncation mutation that reduces steroid binding activity or steroid sensitivity of the truncated chimeric RXR LBD.

[0208] In another specific embodiment, the present invention relates to an isolated polynucleotide encoding a truncated chimeric RXR LBD comprising a truncation mutation that reduces non-steroid binding activity or non-steroid sensitivity of the truncated chimeric RXR LBD.

[0209] The present invention also relates to an isolated polynucleotide encoding a truncated chimeric RXR LBD comprising a truncation mutation, wherein the mutation enhances ligand binding activity or ligand sensitivity of the truncated chimeric RXR LBD. In a specific embodiment, the present invention relates to an isolated polynucleotide encoding a truncated chimeric RXR LBD comprising a truncation mutation that enhances steroid binding activity or steroid sensitivity of the truncated chimeric RXR LBD.

[0210] In another specific embodiment, the present invention relates to an isolated polynucleotide encoding a truncated chimeric RXR LBD comprising a truncation mutation that enhances non-steroid binding activity or non-steroid sensitivity of the truncated chimeric RXR LBD.

[0211] The present invention also relates to an isolated polynucleotide encoding a truncated chimeric retinoid X receptor LBD comprising a truncation mutation that increases ligand sensitivity of a heterodimer comprising the truncated chimeric retinoid X receptor LBD and a dimerization partner. In a specific embodiment, the dimerization partner is an ecdysone receptor polypeptide. Preferably, the dimerization partner is a truncated EcR polypeptide. More preferably, the dimerization partner is an EcR polypeptide in which domains A/B have been deleted. Even more preferably, the dimerization partner is an EcR polypeptide comprising an amino acid sequence of SEQ ID NO: 57 (CfEcR-DEF), SEQ ID NO: 58 (DmEcR-DEF), SEQ ID NO: 70 (CfEcR-CDEF), SEQ ID NO: 72 (TmEcR-DEF) or SEQ ID NO: 74 (AmaEcR-DEF).

POLYPEPTIDES OF THE INVENTION

[0212] The novel ecdysone receptor/chimeric retinoid X receptor-basedinducible gene expression system of the invention comprises a geneexpression cassette comprising a polynucleotide that encodes a hybridpolypeptide comprising a) a DNA binding domain or a transactivationdomain, and b) an EcR ligand binding domain or a chimeric RXR ligandbinding domain. These gene expression cassettes, the polynucleotidesthey comprise, and the hybrid polypeptides they encode are useful ascomponents of an EcR/chimeric RXR-based gene expression system tomodulate the expression of a gene within a host cell.

[0213] Thus, the present invention also relates to a hybrid polypeptidecomprising a) a DNA binding domain or a transactivation domain accordingto the invention, and b) an EcR ligand binding domain or a chimeric RXRligand binding domain according to the invention.

[0214] The present invention also relates to an isolated polypeptidecomprising a chimeric RXR ligand binding domain according to theinvention.

[0215] The present invention also relates to an isolated truncated EcRLBD or an isolated truncated chimeric RXR LBD comprising a truncationmutation according to the invention. Specifically, the present inventionrelates to an isolated truncated EcR LBD or an isolated truncatedchimeric RXR LBD comprising a truncation mutation that affects ligandbinding activity or ligand sensitivity.

[0216] In a specific embodiment, the isolated EcR LBD polypeptide isencoded by a polynucleotide comprising a polynucleotide sequenceselected from the group consisting of SEQ ID NO: 65 (CfEcR-DEF), SEQ IDNO: 59 (CfEcR-CDEF), SEQ ID NO: 67 (DmEcR-DEF), SEQ ID NO: 71(TmEcR-DEF) and SEQ ID NO: 73 (AmaEcR-DEF).

[0217] In another specific embodiment, the isolated EcR LBD polypeptidecomprises an amino acid sequence selected from the group consisting ofSEQ ID NO: 57 (CfEcR-DEF), SEQ ID NO: 70 (CfEcR-CDEF), SEQ ID NO: 58(DmEcR-DEF), SEQ ID NO: 72 (TmEcR-DEF), and SEQ ID NO: 74 (AmaEcR-DEF).

[0218] In another specific embodiment, the isolated truncated chimeric RXR LBD is encoded by a polynucleotide comprising a polynucleotide sequence selected from the group consisting of a) SEQ ID NO: 45, b) nucleotides 1-348 of SEQ ID NO: 13 and nucleotides 268-630 of SEQ ID NO: 21, c) nucleotides 1-408 of SEQ ID NO: 13 and nucleotides 337-630 of SEQ ID NO: 21, d) nucleotides 1-465 of SEQ ID NO: 13 and nucleotides 403-630 of SEQ ID NO: 21, e) nucleotides 1-555 of SEQ ID NO: 13 and nucleotides 490-630 of SEQ ID NO: 21, f) nucleotides 1-624 of SEQ ID NO: 13 and nucleotides 547-630 of SEQ ID NO: 21, g) nucleotides 1-645 of SEQ ID NO: 13 and nucleotides 601-630 of SEQ ID NO: 21, and h) nucleotides 1-717 of SEQ ID NO: 13 and nucleotides 613-630 of SEQ ID NO: 21.

[0219] In another specific embodiment, the isolated truncated chimeric RXR LBD comprises an amino acid sequence selected from the group consisting of a) SEQ ID NO: 46, b) amino acids 1-116 of SEQ ID NO: 13 and amino acids 90-210 of SEQ ID NO: 21, c) amino acids 1-136 of SEQ ID NO: 13 and amino acids 113-210 of SEQ ID NO: 21, d) amino acids 1-155 of SEQ ID NO: 13 and amino acids 135-210 of SEQ ID NO: 21, e) amino acids 1-185 of SEQ ID NO: 13 and amino acids 164-210 of SEQ ID NO: 21, f) amino acids 1-208 of SEQ ID NO: 13 and amino acids 183-210 of SEQ ID NO: 21, g) amino acids 1-215 of SEQ ID NO: 13 and amino acids 201-210 of SEQ ID NO: 21, and h) amino acids 1-239 of SEQ ID NO: 13 and amino acids 205-210 of SEQ ID NO: 21.
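
The nucleotide ranges recited in paragraph [0218] and the amino acid ranges recited in paragraph [0219] appear to describe the same chimeric junctions in two coordinate systems. The following minimal Python sketch checks that correspondence. It assumes, which the text does not state, that nucleotide position 1 is the first base of codon 1 and that every fragment is in frame; the function and variable names are illustrative only.

```python
def nt_range_to_aa_range(start_nt, end_nt):
    """Map an in-frame, 1-based nucleotide range to a 1-based amino acid range.

    Assumption: nucleotide 1 is the first base of codon 1.
    """
    if (start_nt - 1) % 3 != 0 or end_nt % 3 != 0:
        raise ValueError("range does not begin and end on codon boundaries")
    return ((start_nt - 1) // 3 + 1, end_nt // 3)


# Items b) through h) of paragraph [0218]:
# (nucleotide range of first fragment, nucleotide range of second fragment)
NT_JUNCTIONS = {
    "b": ((1, 348), (268, 630)),
    "c": ((1, 408), (337, 630)),
    "d": ((1, 465), (403, 630)),
    "e": ((1, 555), (490, 630)),
    "f": ((1, 624), (547, 630)),
    "g": ((1, 645), (601, 630)),
    "h": ((1, 717), (613, 630)),
}

# Items b) through h) of paragraph [0219], in amino acid coordinates.
AA_JUNCTIONS = {
    "b": ((1, 116), (90, 210)),
    "c": ((1, 136), (113, 210)),
    "d": ((1, 155), (135, 210)),
    "e": ((1, 185), (164, 210)),
    "f": ((1, 208), (183, 210)),
    "g": ((1, 215), (201, 210)),
    "h": ((1, 239), (205, 210)),
}


def converted_junctions():
    """Convert every nucleotide junction to amino acid coordinates."""
    return {
        item: tuple(nt_range_to_aa_range(*rng) for rng in ranges)
        for item, ranges in NT_JUNCTIONS.items()
    }


if __name__ == "__main__":
    for item, ranges in converted_junctions().items():
        print(item, ranges, "matches [0219]:", ranges == AA_JUNCTIONS[item])
```

Under the stated assumption, each nucleotide range divides evenly into codons and lands on the amino acid range recited for the same item, which is a useful sanity check when transcribing these coordinates.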

[0220] The present invention relates to an isolated truncated chimericRXR LBD comprising a truncation mutation that reduces ligand bindingactivity or ligand sensitivity of the truncated chimeric RXR LBD.

[0221] Thus, the present invention relates to an isolated truncatedchimeric RXR LBD comprising a truncation mutation that reduces ligandbinding activity or ligand sensitivity of the truncated chimeric RXRLBD.

[0222] In a specific embodiment, the present invention relates to anisolated truncated chimeric RXR LBD comprising a truncation mutationthat reduces steroid binding activity or steroid sensitivity of thetruncated chimeric RXR LBD.

[0223] In another specific embodiment, the present invention relates toan isolated truncated chimeric RXR LBD comprising a truncation mutationthat reduces non-steroid binding activity or non-steroid sensitivity ofthe truncated chimeric RXR LBD.

[0224] In addition, the present invention relates to an isolatedtruncated chimeric RXR LBD comprising a truncation mutation thatenhances ligand binding activity or ligand sensitivity of the truncatedchimeric RXR LBD.

[0225] The present invention relates to an isolated truncated chimericRXR LBD comprising a truncation mutation that enhances ligand bindingactivity or ligand sensitivity of the truncated chimeric RXR LBD. In aspecific embodiment, the present invention relates to an isolatedtruncated chimeric RXR LBD comprising a truncation mutation thatenhances steroid binding activity or steroid sensitivity of thetruncated chimeric RXR LBD.

[0226] In another specific embodiment, the present invention relates toan isolated truncated chimeric RXR LBD comprising a truncation mutationthat enhances non-steroid binding activity or non-steroid sensitivity ofthe truncated chimeric RXR LBD.

[0227] The present invention also relates to an isolated truncatedchimeric RXR LBD comprising a truncation mutation that increases ligandsensitivity of a heterodimer comprising the truncated chimeric RXR LBDand a dimerization partner.

[0228] In a specific embodiment, the dimerization partner is an ecdysone receptor polypeptide. Preferably, the dimerization partner is a truncated EcR polypeptide. More preferably, the dimerization partner is an EcR polypeptide in which domains A/B or A/B/C have been deleted. Even more preferably, the dimerization partner is an EcR polypeptide comprising an amino acid sequence of SEQ ID NO: 57 (CfEcR-DEF), SEQ ID NO: 58 (DmEcR-DEF), SEQ ID NO: 70 (CfEcR-CDEF), SEQ ID NO: 72 (TmEcR-DEF) or SEQ ID NO: 74 (AmaEcR-DEF).

METHOD OF MODULATING GENE EXPRESSION OF THE INVENTION

[0229] Applicants' invention also relates to methods of modulating geneexpression in a host cell using a gene expression modulation systemaccording to the invention. Specifically, Applicants' invention providesa method of modulating the expression of a gene in a host cellcomprising the steps of: a) introducing into the host cell a geneexpression modulation system according to the invention; and b)introducing into the host cell a ligand; wherein the gene to bemodulated is a component of a gene expression cassette comprising: i) aresponse element comprising a domain recognized by the DNA bindingdomain of the first hybrid polypeptide; ii) a promoter that is activatedby the transactivation domain of the second hybrid polypeptide; and iii)a gene whose expression is to be modulated, whereby upon introduction ofthe ligand into the host cell, expression of the gene is modulated.

[0230] The invention also provides a method of modulating the expressionof a gene in a host cell comprising the steps of: a) introducing intothe host cell a gene expression modulation system according to theinvention; b) introducing into the host cell a gene expression cassettecomprising i) a response element comprising a domain recognized by theDNA binding domain from the first hybrid polypeptide; ii) a promoterthat is activated by the transactivation domain of the second hybridpolypeptide; and iii) a gene whose expression is to be modulated; and c)introducing into the host cell a ligand; whereby expression of the geneis modulated in the host cell.

[0231] Genes of interest for expression in a host cell using Applicants'methods may be endogenous genes or heterologous genes. Nucleic acid oramino acid sequence information for a desired gene or protein can belocated in one of many public access databases, for example, GENBANK,EMBL, Swiss-Prot, and PIR, or in many biology related journalpublications. Thus, those skilled in the art have access to nucleic acidsequence information for virtually all known genes. Such information canthen be used to construct the desired constructs for the insertion ofthe gene of interest within the gene expression cassettes used inApplicants' methods described herein.

[0232] Examples of genes of interest for expression in a host cell using Applicants' methods include, but are not limited to: genes encoding therapeutically desirable polypeptides or products that may be used to treat a condition, a disease, a disorder, a dysfunction, or a genetic defect, such as monoclonal antibodies, enzymes, proteases, cytokines, interferons, insulin, erythropoietin, clotting factors, other blood factors or components, viral vectors for gene therapy, virus for vaccines, targets for drug discovery, functional genomics, and proteomics analyses and applications, and the like.

[0233] Acceptable ligands are any that modulate expression of the gene when binding of the DNA binding domain of the two-hybrid system to the response element in the presence of the ligand results in activation or suppression of expression of the genes. Preferred ligands include ponasterone, muristerone A, 9-cis-retinoic acid, synthetic analogs of retinoic acid, N,N′-diacylhydrazines such as those disclosed in U.S. Pat. Nos. 6,013,836; 5,117,057; 5,530,028; and 5,378,726; dibenzoylalkyl cyanohydrazines such as those disclosed in European Application No. 461,809; N-alkyl-N,N′-diaroylhydrazines such as those disclosed in U.S. Pat. No. 5,225,443; N-acyl-N-alkylcarbonylhydrazines such as those disclosed in European Application No. 234,994; N-aroyl-N-alkyl-N′-aroylhydrazines such as those described in U.S. Pat. No. 4,985,461; each of which is incorporated herein by reference; and other similar materials including 3,5-di-tert-butyl-4-hydroxy-N-isobutyl-benzamide, 8-O-acetylharpagide, and the like.

[0234] In a preferred embodiment, the ligand for use in Applicants' method of modulating expression of a gene is a compound of the formula:

[0235] wherein:

[0236] E is a (C₄-C₆)alkyl containing a tertiary carbon or a cyano(C₃-C₅)alkyl containing a tertiary carbon;

[0237] R¹ is H, Me, Et, i-Pr, F, formyl, CF₃, CHF₂, CHCl₂, CH₂F, CH₂Cl, CH₂OH, CH₂OMe, CH₂CN, CN, C≡CH, 1-propynyl, 2-propynyl, vinyl, OH, OMe, OEt, cyclopropyl, CF₂CF₃, CH═CHCN, allyl, azido, SCN, or SCHF₂;

[0238] R² is H, Me, Et, n-Pr, i-Pr, formyl, CF₃, CHF₂, CHCl₂, CH₂F, CH₂Cl, CH₂OH, CH₂OMe, CH₂CN, CN, C≡CH, 1-propynyl, 2-propynyl, vinyl, Ac, F, Cl, OH, OMe, OEt, O-n-Pr, OAc, NMe₂, NEt₂, SMe, SEt, SOCF₃, OCF₂CF₂H, COEt, cyclopropyl, CF₂CF₃, CH═CHCN, allyl, azido, OCF₃, OCHF₂, O-i-Pr, SCN, SCHF₂, SOMe, NH-CN, or joined with R³ and the phenyl carbons to which R² and R³ are attached to form an ethylenedioxy, a dihydrofuryl ring with the oxygen adjacent to a phenyl carbon, or a dihydropyryl ring with the oxygen adjacent to a phenyl carbon;

[0239] R³ is H, Et, or joined with R² and the phenyl carbons to which R² and R³ are attached to form an ethylenedioxy, a dihydrofuryl ring with the oxygen adjacent to a phenyl carbon, or a dihydropyryl ring with the oxygen adjacent to a phenyl carbon;

[0240] R⁴, R⁵, and R⁶ are independently H, Me, Et, F, Cl, Br, formyl, CF₃, CHF₂, CHCl₂, CH₂F, CH₂Cl, CH₂OH, CN, C≡CH, 1-propynyl, 2-propynyl, vinyl, OMe, OEt, SMe, or SEt.

[0241] In another preferred embodiment, a second ligand may be used inaddition to the first ligand discussed above in Applicants' method ofmodulating expression of a gene, wherein the second ligand is9-cis-retinoic acid or a synthetic analog of retinoic acid.

[0242] Applicants' invention provides for modulation of gene expressionin prokaryotic and eukaryotic host cells. Thus, the present inventionalso relates to a method for modulating gene expression in a host cellselected from the group consisting of a bacterial cell, a fungal cell, ayeast cell, an animal cell, and a mammalian cell. Preferably, the hostcell is a yeast cell, a hamster cell, a mouse cell, a monkey cell, or ahuman cell.

[0243] Expression in transgenic host cells may be useful for theexpression of various polypeptides of interest including but not limitedto therapeutic polypeptides, pathway intermediates; for the modulationof pathways already existing in the host for the synthesis of newproducts heretofore not possible using the host; cell based assays;functional genomics assays, biotherapeutic protein production,proteomics assays, and the like. Additionally the gene products may beuseful for conferring higher growth yields of the host or for enablingan alternative growth mode to be utilized.

HOST CELLS AND NON-HUMAN ORGANISMS OF THE INVENTION

[0244] As described above, the gene expression modulation system of thepresent invention may be used to modulate gene expression in a hostcell. Expression in transgenic host cells may be useful for theexpression of various genes of interest. Thus, Applicants' inventionprovides an isolated host cell comprising a gene expression systemaccording to the invention. The present invention also provides anisolated host cell comprising a gene expression cassette according tothe invention. Applicants' invention also provides an isolated host cellcomprising a polynucleotide or a polypeptide according to the invention.The isolated host cell may be either a prokaryotic or a eukaryotic hostcell.

[0245] Preferably, the host cell is selected from the group consisting of a bacterial cell, a fungal cell, a yeast cell, an animal cell, and a mammalian cell. Examples of preferred host cells include, but are not limited to, fungal or yeast species such as Aspergillus, Trichoderma, Saccharomyces, Pichia, Candida, Hansenula; bacterial species such as those in the genera Synechocystis, Synechococcus, Salmonella, Bacillus, Acinetobacter, Rhodococcus, Streptomyces, Escherichia, Pseudomonas, Methylomonas, Methylobacter, Alcaligenes, Anabaena, Thiobacillus, Methanobacterium and Klebsiella; and animal and mammalian host cells.

[0246] In a specific embodiment, the host cell is a yeast cell selectedfrom the group consisting of a Saccharomyces, a Pichia, and a Candidahost cell.

[0247] In another specific embodiment, the host cell is a hamster cell.

[0248] In another specific embodiment, the host cell is a murine cell.

[0249] In another specific embodiment, the host cell is a monkey cell.

[0250] In another specific embodiment, the host cell is a human cell.

[0251] Host cell transformation is well known in the art and may beachieved by a variety of methods including but not limited toelectroporation, viral infection, plasmid/vector transfection, non-viralvector mediated transfection, particle bombardment, and the like.Expression of desired gene products involves culturing the transformedhost cells under suitable conditions and inducing expression of thetransformed gene. Culture conditions and gene expression protocols inprokaryotic and eukaryotic cells are well known in the art (see GeneralMethods section of Examples). Cells may be harvested and the geneproducts isolated according to protocols specific for the gene product.

[0252] In addition, a host cell may be chosen which modulates theexpression of the inserted polynucleotide, or modifies and processes thepolypeptide product in the specific fashion desired. Different hostcells have characteristic and specific mechanisms for the translationaland post-translational processing and modification [e.g., glycosylation,cleavage (e.g. of signal sequence)] of proteins. Appropriate cell linesor host systems can be chosen to ensure the desired modification andprocessing of the foreign protein expressed. For example, expression ina bacterial system can be used to produce a non-glycosylated coreprotein product. However, a polypeptide expressed in bacteria may not beproperly folded. Expression in yeast can produce a glycosylated product.Expression in eukaryotic cells can increase the likelihood of “native”glycosylation and folding of a heterologous protein. Moreover,expression in mammalian cells can provide a tool for reconstituting, orconstituting, the polypeptide's activity. Furthermore, differentvector/host expression systems may affect processing reactions, such asproteolytic cleavages, to a different extent.

[0253] Applicants' invention also relates to a non-human organismcomprising an isolated host cell according to the invention. Preferably,the non-human organism is selected from the group consisting of abacterium, a fungus, a yeast, an animal, and a mammal. More preferably,the non-human organism is a yeast, a mouse, a rat, a rabbit, a cat, adog, a bovine, a goat, a pig, a horse, a sheep, a monkey, or achimpanzee.

[0254] In a specific embodiment, the non-human organism is a yeastselected from the group consisting of Saccharomyces, Pichia, andCandida.

[0255] In another specific embodiment, the non-human organism is a Musmusculus mouse.

MEASURING GENE EXPRESSION/TRANSCRIPTION

[0256] One useful measurement of Applicants' methods of the invention isthat of the transcriptional state of the cell including the identitiesand abundances of RNA, preferably mRNA species. Such measurements areconveniently conducted by measuring cDNA abundances by any of severalexisting gene expression technologies.

[0257] Nucleic acid array technology is a useful technique fordetermining differential mRNA expression. Such technology includes, forexample, oligonucleotide chips and DNA microarrays. These techniquesrely on DNA fragments or oligonucleotides which correspond to differentgenes or cDNAs which are immobilized on a solid support and hybridizedto probes prepared from total mRNA pools extracted from cells, tissues,or whole organisms and converted to cDNA. Oligonucleotide chips arearrays of oligonucleotides synthesized on a substrate usingphotolithographic techniques. Chips have been produced which can analyzefor up to 1700 genes. DNA microarrays are arrays of DNA samples,typically PCR products, that are robotically printed onto a microscopeslide. Each gene is analyzed by a full or partial-length target DNAsequence. Microarrays with up to 10,000 genes are now routinely preparedcommercially. The primary difference between these two techniques isthat oligonucleotide chips typically utilize 25-mer oligonucleotideswhich allow fractionation of short DNA molecules whereas the larger DNAtargets of microarrays, approximately 1000 base pairs, may provide moresensitivity in fractionating complex DNA mixtures.

[0258] Another useful measurement of Applicants' methods of theinvention is that of determining the translation state of the cell bymeasuring the abundances of the constituent protein species present inthe cell using processes well known in the art.

[0259] Where identification of genes associated with various physiological functions is desired, an assay may be employed in which changes in such functions as cell growth, apoptosis, senescence, differentiation, adhesion, binding to specific molecules, binding to another cell, cellular organization, organogenesis, intracellular transport, transport facilitation, energy conversion, metabolism, myogenesis, neurogenesis, and/or hematopoiesis are measured.

[0260] In addition, selectable marker or reporter gene expression may beused to measure gene expression modulation using Applicants' invention.

[0261] Other methods to detect the products of gene expression are well known in the art and include Southern blots (DNA detection), dot or slot blots (DNA, RNA), northern blots (RNA), RT-PCR (RNA), western blots (polypeptide detection), and ELISA (polypeptide) analyses. Although less preferred, labeled proteins can be used to detect a particular nucleic acid sequence to which they hybridize.

[0262] In some cases it is necessary to amplify the amount of a nucleicacid sequence. This may be carried out using one or more of a number ofsuitable methods including, for example, polymerase chain reaction(“PCR”), ligase chain reaction (“LCR”), strand displacementamplification (“SDA”), transcription-based amplification, and the like.PCR is carried out in accordance with known techniques in which, forexample, a nucleic acid sample is treated in the presence of a heatstable DNA polymerase, under hybridizing conditions, with one pair ofoligonucleotide primers, with one primer hybridizing to one strand(template) of the specific sequence to be detected. The primers aresufficiently complementary to each template strand of the specificsequence to hybridize therewith. An extension product of each primer issynthesized and is complementary to the nucleic acid template strand towhich it hybridized. The extension product synthesized from each primercan also serve as a template for further synthesis of extension productsusing the same primers. Following a sufficient number of rounds ofsynthesis of extension products, the sample may be analyzed as describedabove to assess whether the sequence or sequences to be detected arepresent.
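
As a rough illustration of why repeated rounds of primer extension make a rare sequence detectable, the sketch below uses the common idealized model in which each cycle multiplies the number of target copies by (1 + efficiency). The function names and the `efficiency` parameter are assumptions introduced for illustration; real reactions plateau and rarely sustain perfect doubling.

```python
def expected_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a given number of PCR cycles.

    efficiency = 1.0 models perfect doubling per cycle (an idealization).
    """
    if initial_copies < 0 or cycles < 0:
        raise ValueError("initial_copies and cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(initial_copies, target_copies, efficiency=1.0):
    """Smallest whole number of cycles at which the model reaches target_copies."""
    if initial_copies <= 0:
        raise ValueError("initial_copies must be positive")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 0 and at most 1")
    cycles = 0
    copies = float(initial_copies)
    # Step cycle by cycle rather than using logarithms, to avoid rounding surprises.
    while copies < target_copies:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(expected_copies(1, 10))          # copies from one template after 10 ideal cycles
    print(cycles_needed(10, 1_000_000))    # ideal cycles to go from 10 to at least 1e6 copies
```

In this model the copy number grows geometrically, which is why a "sufficient number of rounds" is typically a few tens of cycles rather than hundreds.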

[0263] The present invention may be better understood by reference tothe following non-limiting Examples, which are provided as exemplary ofthe invention.

EXAMPLES

[0264] GENERAL METHODS

[0265] Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, N.Y. (1989) (Maniatis) and by T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1984) and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, Greene Publishing Assoc. and Wiley-Interscience (1987).

[0266] Materials and methods suitable for the maintenance and growth of bacterial cultures are well known in the art. Techniques suitable for use in the following examples may be found as set out in Manual of Methods for General Bacteriology (Phillipp Gerhardt, R. G. E. Murray, Ralph N. Costilow, Eugene W. Nester, Willis A. Wood, Noel R. Krieg and G. Briggs Phillips, eds), American Society for Microbiology, Washington, D.C. (1994) or by Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Edition, Sinauer Associates, Inc., Sunderland, Mass. (1989). All reagents, restriction enzymes and materials used for the growth and maintenance of host cells were obtained from Aldrich Chemicals (Milwaukee, Wis.), DIFCO Laboratories (Detroit, Mich.), GIBCO/BRL (Gaithersburg, Md.), or Sigma Chemical Company (St. Louis, Mo.) unless otherwise specified.

[0267] Manipulations of genetic sequences may be accomplished using the suite of programs available from the Genetics Computer Group Inc. (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wis.). Where the GCG program “Pileup” is used, the gap creation default value of 12 and the gap extension default value of 4 may be used. Where the GCG “Gap” or “Bestfit” program is used, the default gap creation penalty of 50 and the default gap extension penalty of 3 may be used. In any case where GCG program parameters are not prompted for, in these or any other GCG program, default values may be used.

[0268] The meaning of abbreviations is as follows: “h” means hour(s),“min” means minute(s), “sec” means second(s), “d” means day(s), “μl”means microliter(s), “ml” means milliliter(s), “L” means liter(s), “μM”means micromolar, “mM” means millimolar, “μg” means microgram(s), “mg”means milligram(s), “A” means adenine or adenosine, “T” means thymine orthymidine, “G” means guanine or guanosine, “C” means cytidine orcytosine, “x g” means times gravity, “nt” means nucleotide(s), “aa”means amino acid(s), “bp” means base pair(s), “kb” means kilobase(s),“k” means kilo, “μ” means micro, and “° C.” means degrees Celsius.

Example 1

[0269] Applicants' EcR/chimeric RXR-based inducible gene expression modulation system is useful in various applications including gene therapy, expression of proteins of interest in host cells, production of transgenic organisms, and cell-based assays. Applicants have made the surprising discovery that a chimeric retinoid X receptor ligand binding domain can substitute for either parent RXR polypeptide and function inducibly in an EcR/chimeric RXR-based gene expression modulation system upon binding of ligand. In addition, the chimeric RXR polypeptide may also function better than either parent/donor RXR ligand binding domain. Applicants' surprising discovery and unexpected superior results provide a novel inducible gene expression system for bacterial, fungal, yeast, animal, and mammalian cell applications. This Example describes the construction of several gene expression cassettes for use in the EcR/chimeric RXR-based inducible gene expression system of the invention.

[0270] Applicants constructed several EcR-based gene expression cassettes based on the spruce budworm Choristoneura fumiferana EcR (“CfEcR”), C. fumiferana ultraspiracle (“CfUSP”), Drosophila melanogaster EcR (“DmEcR”), D. melanogaster USP (“DmUSP”), Tenebrio molitor EcR (“TmEcR”), Amblyomma americanum EcR (“AmaEcR”), A. americanum RXR homolog 1 (“AmaRXR1”), A. americanum RXR homolog 2 (“AmaRXR2”), mouse Mus musculus retinoid X receptor α isoform (“MmRXRα”), human Homo sapiens retinoid X receptor β isoform (“HsRXRβ”), and locust Locusta migratoria ultraspiracle (“LmUSP”).

[0271] The prepared receptor constructs comprise 1) an EcR ligand binding domain (LBD), a vertebrate RXR (MmRXRα or HsRXRβ) LBD, an invertebrate USP (CfUSP or DmUSP) LBD, an invertebrate RXR (LmUSP, AmaRXR1 or AmaRXR2) LBD, or a chimeric RXR LBD comprising a vertebrate RXR LBD fragment and an invertebrate RXR LBD fragment; and 2) a GAL4 or LexA DNA binding domain (DBD) or a VP16 or B42 acidic activator transactivation domain (AD). The reporter constructs include a reporter gene, luciferase or LacZ, operably linked to a synthetic promoter construct that comprises either a GAL4 response element or a LexA response element to which the Gal4 DBD or LexA DBD binds, respectively. Various combinations of these receptor and reporter constructs were cotransfected into mammalian cells as described in Examples 2-6 infra.
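
The pairing rules in paragraph [0271] can be summarized as a small data model: each receptor construct joins one LBD to either a DBD or an AD, and the reporter's response element must be the one bound by the DBD in use. The Python sketch below is an illustration only. The class names, the `is_matched_switch` helper, and the rule that exactly one DBD fusion is paired with one AD fusion are assumptions made for the example; the LBD labels are taken from the constructs named in this Example.

```python
from dataclasses import dataclass

DNA_BINDING_DOMAINS = {"GAL4", "LexA"}
ACTIVATION_DOMAINS = {"VP16", "B42"}
REPORTER_GENES = {"luciferase", "LacZ"}


@dataclass(frozen=True)
class ReceptorConstruct:
    """One hybrid polypeptide: a DBD or an AD fused to a ligand binding domain."""
    fusion_domain: str  # "GAL4", "LexA", "VP16" or "B42"
    lbd: str            # free-text label, e.g. "CfEcR-CDEF"

    def __post_init__(self):
        if self.fusion_domain not in DNA_BINDING_DOMAINS | ACTIVATION_DOMAINS:
            raise ValueError(f"unknown fusion domain: {self.fusion_domain}")

    @property
    def has_dbd(self):
        return self.fusion_domain in DNA_BINDING_DOMAINS


@dataclass(frozen=True)
class ReporterConstruct:
    """Reporter gene downstream of a promoter carrying one response element."""
    response_element: str  # "GAL4" or "LexA"
    gene: str              # "luciferase" or "LacZ"

    def __post_init__(self):
        if self.response_element not in DNA_BINDING_DOMAINS:
            raise ValueError(f"unknown response element: {self.response_element}")
        if self.gene not in REPORTER_GENES:
            raise ValueError(f"unknown reporter gene: {self.gene}")


def is_matched_switch(first, second, reporter):
    """True when one construct carries a DBD, the other an AD, and the
    reporter's response element is the one recognized by that DBD."""
    dbd_fusions = [c for c in (first, second) if c.has_dbd]
    if len(dbd_fusions) != 1:
        return False
    return dbd_fusions[0].fusion_domain == reporter.response_element


if __name__ == "__main__":
    ecr = ReceptorConstruct("GAL4", "CfEcR-CDEF")
    rxr = ReceptorConstruct("VP16", "MmRXRα-EF")
    print(is_matched_switch(ecr, rxr, ReporterConstruct("GAL4", "luciferase")))
    print(is_matched_switch(ecr, rxr, ReporterConstruct("LexA", "LacZ")))
```

The check is deliberately structural: it says nothing about whether a given pairing is inducible, only whether the DBD, AD and response element are combined consistently.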

[0272] Gene Expression Cassettes: Ecdysone receptor-based gene expression cassette pairs (switches) were constructed as follows, using standard cloning methods available in the art. The following is a brief description of the preparation and composition of each switch used in the Examples described herein.

[0273] 1.1-GAL4CfEcR-CDEF/VP16MmRXRα-EF: The C, D, E, and F domains fromspruce budworm Choristoneura fumiferana EcR (“CfEcR-CDEF”; SEQ ID NO:59) were fused to a GAL4 DNA binding domain (“Gal4DNABD” or “Gal4DBD”;SEQ ID NO: 47) and placed under the control of an SV40e promoter (SEQ IDNO: 60). The EF domains from mouse Mus musculus RXRα (“MmRXRα-EF”; SEQID NO: 9) were fused to the transactivation domain from VP16 (“VP16AD”;SEQ ID NO: 51) and placed under the control of an SV40e promoter (SEQ IDNO: 60). Five consensus GAL4 response element binding sites (“5XGAL4RE”;comprising 5 copies of a GAL4RE comprising SEQ ID NO: 55) were fused toa synthetic E1b minimal promoter (SEQ ID NO: 61) and placed upstream ofthe luciferase gene (SEQ ID NO: 62).

[0274] 1.2-Gal4CfEcR-CDEF/VP16LmUSP-EF: This construct was prepared inthe same way as in switch 1.1 above except MmRXRα-EF was replaced withthe EF domains from Locusta migratoria ultraspiracle (“LmUSP-EF”; SEQ IDNO: 21).

[0275] 1.3-Gal4CfEcR-CDEF/VP16MmRXRα(1-7)-LmUSP(8-12)-EF: This construct was prepared in the same way as in switch 1.1 above except MmRXRα-EF was replaced with helices 1 through 7 of MmRXRα-EF and helices 8 through 12 of LmUSP-EF (SEQ ID NO: 45).

[0276] 1.4-Gal4CfEcR-CDEF/VP16MmRXRα(1-7)-LmUSP(8-12)-EF-MmRXRα-F: This construct was prepared in the same way as in switch 1.1 above except MmRXRα-EF was replaced with helices 1 through 7 of MmRXRα-EF and helices 8 through 12 of LmUSP-EF (SEQ ID NO: 45), and wherein the last C-terminal 18 nucleotides of SEQ ID NO: 45 (F domain) were replaced with the F domain of MmRXRα (“MmRXRα-F”, SEQ ID NO: 63).

[0277] 1.5-Gal4CfEcR-CDEF/VP16MmRXRα(1-12)-EF-LmUSP-F: This constructwas prepared in the same way as in switch 1.1 above except MmRXRα-EF wasreplaced with helices 1 through 12 of MmRXRα-EF (SEQ ID NO: 9) andwherein the last C-terminal 18 nucleotides of SEQ ID NO: 9 (F domain)were replaced with the F domain of LmUSP (“LmUSP-F”, SEQ ID NO: 64).

[0278] 1.6-Gal4CfEcR-CDEF/VP16LmUSP(1-12)-EF-MmRXRα-F: This construct was prepared in the same way as in switch 1.1 above except MmRXRα-EF was replaced with helices 1 through 12 of LmUSP-EF (SEQ ID NO: 21) and wherein the last C-terminal 18 nucleotides of SEQ ID NO: 21 (F domain) were replaced with the F domain of MmRXRα (“MmRXRα-F”, SEQ ID NO: 63).

[0279] 1.7-GAL4CfEcR-DEF/VP16CfUSP-EF: The D, E, and F domains fromspruce budworm Choristoneura fumiferana EcR (“CfEcR-DEF”; SEQ ID NO: 65)were fused to a GAL4 DNA binding domain (“Gal4DNABD” or “Gal4DBD”; SEQID NO: 47) and placed under the control of an SV40e promoter (SEQ ID NO:60). The EF domains from C. fumiferana USP (“CfUSP-EF”; SEQ ID NO: 66)were fused to the transactivation domain from VP16 (“VP16AD”; SEQ ID NO:51) and placed under the control of an SV40e promoter (SEQ ID NO: 60).Five consensus GAL4 response element binding sites (“5XGAL4RE”;comprising 5 copies of a GAL4RE comprising SEQ ID NO: 55) were fused toa synthetic E1b minimal promoter (SEQ ID NO: 61) and placed upstream ofthe luciferase gene (SEQ ID NO: 62).

[0280] 1.8-GAL4CfEcR-DEF/VP16DmUSP-EF: This construct was prepared inthe same way as in switch 1.7 above except CfUSP-EF was replaced withthe corresponding EF domains from fruit fly Drosophila melanogaster USP(“DmUSP-EF”, SEQ ID NO: 75).

[0281] 1.9-Gal4CfEcR-DEF/VP16LmUSP-EF: This construct was prepared inthe same way as in switch 1.7 above except CfUSP-EF was replaced withthe EF domains from Locusta migratoria USP (“LmUSP-EF”; SEQ ID NO: 21).

[0282] 1.10-GAL4CfEcR-DEF/VP16MmRXRα-EF: This construct was prepared in the same way as in switch 1.7 above except CfUSP-EF was replaced with the EF domains of M. musculus MmRXRα (“MmRXRα-EF”, SEQ ID NO: 9).

[0283] 1.11-GAL4CfEcR-DEF/VP16AmaRXR1-EF: This construct was prepared inthe same way as in switch 1.7 above except CfUSP-EF was replaced withthe EF domains of tick Amblyomma americanum RXR homolog 1 (“AmaRXR1-EF”,SEQ ID NO: 22).

[0284] 1.12-GAL4CfEcR-DEF/VP16AmaRXR2-EF: This construct was prepared inthe same way as in switch 1.7 above except CfUSP-EF was replaced withthe EF domains of tick A. americanum RXR homolog 2 (“AmaRXR2-EF”, SEQ IDNO: 23).

[0285] 1.13-Gal4CfEcR-DEF/VP16MmRXRα(1-7)-LmUSP(8-12)-EF (“αChimera#7”): This construct was prepared in the same way as in switch 1.7 above except CfUSP-EF was replaced with helices 1 through 7 of MmRXRα-EF and helices 8 through 12 of LmUSP-EF (SEQ ID NO: 45).

[0286] 1.14-GAL4DmEcR-DEF/VP16CfUSP-EF: The D, E, and F domains fromfruit fly Drosophila melanogaster EcR (“DmEcR-DEF”; SEQ ID NO: 67) werefused to a GAL4 DNA binding domain (“Gal4DNABD” or “Gal4DBD”; SEQ ID NO:47) and placed under the control of an SV40e promoter (SEQ ID NO: 60).The EF domains from C. fumiferana USP (“CfUSP-EF”; SEQ ID NO: 66) werefused to the transactivation domain from VP16 (“VP16AD”; SEQ ID NO: 51)and placed under the control of an SV40e promoter (SEQ ID NO: 60). Fiveconsensus GAL4 response element binding sites (“5XGAL4RE”; comprising 5copies of a GAL4RE comprising SEQ ID NO: 55) were fused to a syntheticE1b minimal promoter (SEQ ID NO: 61) and placed upstream of theluciferase gene (SEQ ID NO: 62).

[0287] 1.15-GAL4DmEcR-DEF/VP16DmUSP-EF: This construct was prepared in the same way as in switch 1.14 above except CfUSP-EF was replaced with the corresponding EF domains from fruit fly Drosophila melanogaster USP (“DmUSP-EF”, SEQ ID NO: 75).

[0288] 1.16-Gal4DmEcR-DEF/VP16LmUSP-EF: This construct was prepared in the same way as in switch 1.14 above except CfUSP-EF was replaced with the EF domains from Locusta migratoria USP (“LmUSP-EF”; SEQ ID NO: 21).

[0289] 1.17-GAL4DmEcR-DEF/VP16MmRXRα-EF: This construct was prepared in the same way as in switch 1.14 above except CfUSP-EF was replaced with the EF domains of Mus musculus MmRXRα (“MmRXRα-EF”, SEQ ID NO: 9).

[0290] 1.18-GAL4DmEcR-DEF/VP16AmaRXR1-EF: This construct was prepared in the same way as in switch 1.14 above except CfUSP-EF was replaced with the EF domains of ixodid tick Amblyomma americanum RXR homolog 1 (“AmaRXR1-EF”, SEQ ID NO: 22).

[0291] 1.19-GAL4DmEcR-DEF/VP16AmaRXR2-EF: This construct was prepared in the same way as in switch 1.14 above except CfUSP-EF was replaced with the EF domains of ixodid tick A. americanum RXR homolog 2 (“AmaRXR2-EF”, SEQ ID NO: 23).

[0292] 1.20-Gal4DmEcR-DEF/VP16MmRXRα(1-7)-LmUSP(8-12)-EF: This construct was prepared in the same way as in switch 1.14 above except CfUSP-EF was replaced with helices 1 through 7 of MmRXRα-EF and helices 8 through 12 of LmUSP-EF (SEQ ID NO: 45).

[0293] 1.21-GAL4TmEcR-DEF/VP16MmRXRα(1-7)-LmUSP(8-12)-EF: This construct was prepared in the same way as in switch 1.20 above except DmEcR-DEF was replaced with the corresponding D, E, and F domains from beetle Tenebrio molitor EcR (“TmEcR-DEF”, SEQ ID NO: 71), fused to a GAL4 DNA binding domain (“Gal4DNABD” or “Gal4DBD”; SEQ ID NO: 47) and placed under the control of an SV40e promoter (SEQ ID NO: 60). Chimeric EF domains comprising helices 1 through 7 of MmRXRα-EF and helices 8 through 12 of LmUSP-EF (SEQ ID NO: 45) were fused to the transactivation domain from VP16 (“VP16AD”; SEQ ID NO: 51) and placed under the control of an SV40e promoter (SEQ ID NO: 60). Five consensus GAL4 response element binding sites (“5XGAL4RE”; comprising 5 copies of a GAL4RE comprising SEQ ID NO: 55) were fused to a synthetic E1b minimal promoter (SEQ ID NO: 61) and placed upstream of the luciferase gene (SEQ ID NO: 62).

[0294] 1.22-Gal4AmaEcR-DEF/VP16MmRXRα(1-7)-LmUSP(8-12)-EF: This construct was prepared in the same way as in switch 1.21 above except TmEcR-DEF was replaced with the corresponding DEF domains of tick Amblyomma americanum EcR (“AmaEcR-DEF”, SEQ ID NO: 73).

[0295] 1.23-GAL4CfEcR-CDEF/VP16HsRXRβ-EF: The C, D, E, and F domains from spruce budworm Choristoneura fumiferana EcR (“CfEcR-CDEF”; SEQ ID NO: 59) were fused to a GAL4 DNA binding domain (“Gal4DNABD” or “Gal4DBD”; SEQ ID NO: 47) and placed under the control of an SV40e promoter (SEQ ID NO: 60). The EF domains from human Homo sapiens RXRβ (“HsRXRβ-EF”; SEQ ID NO: 13) were fused to the transactivation domain from VP16 (“VP16AD”; SEQ ID NO: 51) and placed under the control of an SV40e promoter (SEQ ID NO: 60). Five consensus GAL4 response element binding sites (“5XGAL4RE”; comprising 5 copies of a GAL4RE comprising SEQ ID NO: 55) were fused to a synthetic E1b minimal promoter (SEQ ID NO: 61) and placed upstream of the luciferase gene (SEQ ID NO: 62).

[0296] 1.24-GAL4CfEcR-DEF/VP16HsRXRβ-EF: This construct was prepared in the same way as in switch 1.23 above except CfEcR-CDEF was replaced with the DEF domains of C. fumiferana EcR (“CfEcR-DEF”; SEQ ID NO: 65).

[0297] 1.25-GAL4CfEcR-DEF/VP16HsRXRβ(1-6)-LmUSP(7-12)-EF (“βChimera#6”): This construct was prepared in the same way as in switch 1.24 above except HsRXRβ-EF was replaced with helices 1 through 6 of HsRXRβ-EF (nucleotides 1-348 of SEQ ID NO: 13) and helices 7 through 12 of LmUSP-EF (nucleotides 268-630 of SEQ ID NO: 21).

[0298] 1.26-GAL4CfEcR-DEF/VP16HsRXRβ(1-7)-LmUSP(8-12)-EF (“βChimera#8”): This construct was prepared in the same way as in switch 1.24 above except HsRXRβ-EF was replaced with helices 1 through 7 of HsRXRβ-EF (nucleotides 1-408 of SEQ ID NO: 13) and helices 8 through 12 of LmUSP-EF (nucleotides 337-630 of SEQ ID NO: 21).

[0299] 1.27-GAL4CfEcR-DEF/VP16HsRXRβ(1-8)-LmUSP(9-12)-EF (“βChimera#9”): This construct was prepared in the same way as in switch 1.24 above except HsRXRβ-EF was replaced with helices 1 through 8 of HsRXRβ-EF (nucleotides 1-465 of SEQ ID NO: 13) and helices 9 through 12 of LmUSP-EF (nucleotides 403-630 of SEQ ID NO: 21).

[0300] 1.28-GAL4CfEcR-DEF/VP16HsRXRβ(1-9)-LmUSP(10-12)-EF (“βChimera#10”): This construct was prepared in the same way as in switch 1.24 above except HsRXRβ-EF was replaced with helices 1 through 9 of HsRXRβ-EF (nucleotides 1-555 of SEQ ID NO: 13) and helices 10 through 12 of LmUSP-EF (nucleotides 490-630 of SEQ ID NO: 21).

[0301] 1.29-GAL4CfEcR-DEF/VP16HsRXRβ(1-10)-LmUSP(11-12)-EF (“βChimera#11”): This construct was prepared in the same way as in switch 1.24 above except HsRXRβ-EF was replaced with helices 1 through 10 of HsRXRβ-EF (nucleotides 1-624 of SEQ ID NO: 13) and helices 11 through 12 of LmUSP-EF (nucleotides 547-630 of SEQ ID NO: 21).

[0302] 1.30-GAL4DmEcR-DEF/VP16HsRXRβ(1-6)-LmUSP(7-12)-EF (“βChimera#6”): This construct was prepared in the same way as in switch 1.25 above except CfEcR-DEF was replaced with DmEcR-DEF (SEQ ID NO: 67).

[0303] 1.31-GAL4DmEcR-DEF/VP16HsRXRβ(1-7)-LmUSP(8-12)-EF (“βChimera#8”): This construct was prepared in the same way as in switch 1.26 above except CfEcR-DEF was replaced with DmEcR-DEF (SEQ ID NO: 67).

[0304] 1.32-GAL4DmEcR-DEF/VP16HsRXRβ(1-8)-LmUSP(9-12)-EF (“βChimera#9”): This construct was prepared in the same way as in switch 1.27 above except CfEcR-DEF was replaced with DmEcR-DEF (SEQ ID NO: 67).

[0305] 1.33-GAL4DmEcR-DEF/VP16HsRXRβ(1-9)-LmUSP(10-12)-EF (“βChimera#10”): This construct was prepared in the same way as in switch 1.28 above except CfEcR-DEF was replaced with DmEcR-DEF (SEQ ID NO: 67).

[0306] 1.34-GAL4DmEcR-DEF/VP16HsRXRβ(1-10)-LmUSP(11-12)-EF (“βChimera#11”): This construct was prepared in the same way as in switch 1.29 above except CfEcR-DEF was replaced with DmEcR-DEF (SEQ ID NO: 67).
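
The nucleotide coordinates given for the HsRXRβ/LmUSP chimeras of switches 1.25 through 1.29 can be expressed as a simple splice of two coding sequences. The following Python sketch is illustrative only and is not the cloning procedure actually used: the function names are hypothetical, the coordinates are read as 1-based and inclusive, and the two input strings are placeholders standing in for SEQ ID NO: 13 and SEQ ID NO: 21.

```python
# Junction coordinates quoted in the text for switches 1.25-1.29:
# (last HsRXRβ-EF nucleotide, first LmUSP-EF nucleotide, last LmUSP-EF nucleotide).
# Coordinates are treated as 1-based and inclusive (an assumption of this sketch).
JUNCTIONS = {
    "beta_chimera_6": (348, 268, 630),
    "beta_chimera_8": (408, 337, 630),
    "beta_chimera_9": (465, 403, 630),
    "beta_chimera_10": (555, 490, 630),
    "beta_chimera_11": (624, 547, 630),
}


def splice_chimera(n_donor, n_end, c_donor, c_start, c_end):
    """Join nucleotides 1..n_end of n_donor to nucleotides c_start..c_end of c_donor."""
    if not 1 <= n_end <= len(n_donor):
        raise ValueError("n_end lies outside the N-terminal donor sequence")
    if not 1 <= c_start <= c_end <= len(c_donor):
        raise ValueError("C-terminal range lies outside the donor sequence")
    # Python slices are 0-based and end-exclusive, hence the c_start - 1.
    return n_donor[:n_end] + c_donor[c_start - 1:c_end]


def chimera_length(n_end, c_start, c_end):
    """Length in nucleotides of the spliced chimera."""
    return n_end + (c_end - c_start + 1)


if __name__ == "__main__":
    hs_rxr_beta_ef = "A" * 624  # placeholder, NOT the real SEQ ID NO: 13
    lm_usp_ef = "C" * 630       # placeholder, NOT the real SEQ ID NO: 21
    for name, (n_end, c_start, c_end) in JUNCTIONS.items():
        seq = splice_chimera(hs_rxr_beta_ef, n_end, lm_usp_ef, c_start, c_end)
        frame = "in frame" if len(seq) % 3 == 0 else "not a multiple of 3"
        print(name, len(seq), frame)
```

With the coordinates as quoted, each of the five spliced lengths (711, 702, 693, 696, and 708 nucleotides) is a multiple of three, which is consistent with junctions placed on codon boundaries.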

Example 2

[0307] Applicants have recently made the surprising discovery that invertebrate RXRs and their non-Lepidopteran and non-Dipteran RXR homologs can function similarly to or better than vertebrate RXRs in an ecdysone receptor-based inducible gene expression modulation system in both yeast and mammalian cells (U.S. provisional application serial No. 60/294,814). Indeed, Applicants have demonstrated that LmUSP is a better partner for CfEcR than mouse RXR in mammalian cells. Yet for most gene expression system applications, particularly those destined for mammalian cells, it is desirable to have a vertebrate RXR as a partner. To identify a minimum region of LmUSP required for this improvement, Applicants have constructed and analyzed vertebrate RXR/invertebrate RXR chimeras (referred to herein interchangeably as “chimeric RXR's” or “RXR chimeras”) in an EcR-based inducible gene expression modulation system. Briefly, gene induction potential (magnitude of induction) and ligand specificity and sensitivity were examined using a non-steroidal ligand in a dose-dependent induction of reporter gene expression in the transfected NIH3T3 cells and A549 cells.

[0308] In the first set of RXR chimeras, helices 8 to 12 from MmRXRα-EF were replaced with helices 8 to 12 from LmUSP-EF (switch 1.3 as prepared in Example 1). Three independent clones (RXR chimeras Ch#1, Ch#2, and Ch#3 in FIGS. 1-3) were picked and compared with the parental MmRXRα-EF and LmUSP-EF switches (switches 1.1 and 1.2, respectively, as prepared in Example 1). The RXR chimera and parent DNAs were transfected into mouse NIH3T3 cells along with Gal4/CfEcR-CDEF and the reporter plasmid pFRLuc. The transfected cells were grown in the presence of 0, 0.2, 1, 5, and 10 μM non-steroidal ligand N-(2-ethyl-3-methoxybenzoyl)-N′-(3,5-dimethylbenzoyl)-N′-tert-butylhydrazine (GS-E™ ligand). The cells were harvested at 48 hours post treatment and the reporter activity was assayed. The numbers on top of bars correspond to the maximum fold activation/induction for that treatment.

[0309] Transfections: DNAs corresponding to the various switch constructs outlined in Example 1, specifically switches 1.1 through 1.6, were transfected into mouse NIH3T3 cells (ATCC) and human A549 cells (ATCC) as follows. Cells were harvested when they reached 50% confluency and plated in 6-, 12- or 24-well plates at 125,000, 50,000, or 25,000 cells, respectively, in 2.5, 1.0, or 0.5 ml of growth medium containing 10% fetal bovine serum (FBS). NIH3T3 cells were grown in Dulbecco's modified Eagle medium (DMEM; Life Technologies) and A549 cells were grown in F12K nutrient mixture (Life Technologies). The next day, the cells were rinsed with growth medium and transfected for four hours. Superfect™ (Qiagen Inc.) was found to be the best transfection reagent for 3T3 cells and A549 cells. For 12-well plates, 4 μl of Superfect™ was mixed with 100 μl of growth medium. 1.0 μg of reporter construct and 0.25 μg of each receptor construct of the receptor pair to be analyzed were added to the transfection mix. A second reporter construct was added [pTKRL (Promega), 0.1 μg/transfection mix] that comprises a Renilla luciferase gene operably linked and placed under the control of a thymidine kinase (TK) constitutive promoter and was used for normalization. The contents of the transfection mix were mixed in a vortex mixer and let stand at room temperature for 30 min. At the end of incubation, the transfection mix was added to the cells maintained in 400 μl growth medium. The cells were maintained at 37° C. and 5% CO₂ for four hours. At the end of incubation, 500 μl of growth medium containing 20% FBS and either dimethylsulfoxide (DMSO; control) or a DMSO solution of 0.2, 1, 5, 10, or 50 μM N-(2-ethyl-3-methoxybenzoyl)-N′-(3,5-dimethylbenzoyl)-N′-tert-butylhydrazine non-steroidal ligand was added and the cells were maintained at 37° C. and 5% CO₂ for 48 hours. The cells were harvested and reporter activity was assayed. The same procedure was followed for 6- and 24-well plates as well, except that all the reagents were doubled for 6-well plates and reduced to half for 24-well plates.
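
The scaling rule stated above (12-well amounts doubled for 6-well plates and halved for 24-well plates) can be sketched as follows. This is an illustration under stated assumptions, not part of the original protocol: the dictionary keys and the function name are hypothetical, the receptor amount is read as 0.25 μg, and the rule is assumed to apply uniformly to every per-well amount listed for the 12-well format.

```python
# Per-well amounts quoted in the text for a 12-well plate.
# Keys are hypothetical labels chosen for this sketch.
BASE_12_WELL = {
    "superfect_ul": 4.0,          # Superfect transfection reagent
    "mix_medium_ul": 100.0,       # growth medium in the transfection mix
    "reporter_ug": 1.0,           # reporter construct
    "receptor_each_ug": 0.25,     # each receptor construct of the pair
    "ptkrl_ug": 0.1,              # Renilla normalization reporter
    "medium_on_cells_ul": 400.0,  # medium on cells when the mix is added
    "ligand_medium_ul": 500.0,    # 20% FBS medium with DMSO or ligand
}

# Scaling relative to the 12-well format, as described in the text.
SCALE = {6: 2.0, 12: 1.0, 24: 0.5}


def per_well_amounts(wells_per_plate):
    """Return per-well reagent amounts for a 6-, 12- or 24-well plate."""
    if wells_per_plate not in SCALE:
        raise ValueError("only 6-, 12- and 24-well plates are described")
    factor = SCALE[wells_per_plate]
    return {name: amount * factor for name, amount in BASE_12_WELL.items()}


if __name__ == "__main__":
    for plate in (6, 12, 24):
        print(plate, per_well_amounts(plate))
```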

[0310] Ligand: The non-steroidal ligand N-(2-ethyl-3-methoxybenzoyl)-N′-(3,5-dimethylbenzoyl)-N′-t-butylhydrazine (GS™-E non-steroidal ligand) is a synthetic stable ecdysteroid ligand synthesized at Rohm and Haas Company. Ligands were dissolved in DMSO and the final concentration of DMSO was maintained at 0.1% in both controls and treatments.
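
Holding DMSO at 0.1% (v/v) in every well implies that each ligand stock is 1000-fold more concentrated than the final dose. The following sketch works through that arithmetic; the function names are hypothetical, and the 1000 μl well volume in the usage lines is an assumption made only for illustration.

```python
# 0.1% (v/v) final DMSO corresponds to a 1:1000 dilution of the DMSO stock.
DILUTION_FACTOR = 1000


def stock_concentration_um(final_um):
    """Ligand concentration (μM) required in the DMSO stock."""
    return final_um * DILUTION_FACTOR


def dmso_volume_ul(final_volume_ul):
    """Volume of DMSO (or DMSO stock) giving 0.1% v/v in the final volume."""
    return final_volume_ul / DILUTION_FACTOR


if __name__ == "__main__":
    for dose in (0.2, 1, 5, 10, 50):  # final doses named in the text, in μM
        print(dose, "μM final needs a stock of", stock_concentration_um(dose), "μM")
    # Assumed final well volume of 1000 μl, for illustration only.
    print(dmso_volume_ul(1000), "μl DMSO per well")
```

Under this reading, a 10 μM final dose calls for a 10 mM stock, and the DMSO-only control receives the same volume of solvent as the treated wells.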

[0311] Reporter Assays: Cells were harvested 48 hours after adding ligands. 125, 250, or 500 μl of passive lysis buffer (part of Dual-luciferase™ reporter assay system from Promega Corporation) were added to each well of 24-, 12-, or 6-well plates, respectively. The plates were placed on a rotary shaker for 15 minutes. Twenty μl of lysate were assayed. Luciferase activity was measured using Dual-luciferase™ reporter assay system from Promega Corporation following the manufacturer's instructions. β-Galactosidase was measured using Galacto-Star™ assay kit from TROPIX following the manufacturer's instructions. All luciferase and β-galactosidase activities were normalized using Renilla luciferase as a standard. Fold activities were calculated by dividing normalized relative light units (“RLU”) in ligand treated cells by normalized RLU in DMSO treated cells (untreated control).
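
The normalization and fold-activity calculation described above can be sketched in a few lines. This is a minimal illustration, not the analysis code used by Applicants: the function names are hypothetical, replicates are assumed to be averaged after normalization, and the readings in the usage lines are invented numbers chosen only to show the arithmetic.

```python
from statistics import mean


def normalize(reporter_rlu, renilla_rlu):
    """Reporter RLU divided by the Renilla RLU measured in the same lysate."""
    if renilla_rlu <= 0:
        raise ValueError("Renilla reading must be positive")
    return reporter_rlu / renilla_rlu


def fold_induction(treated, control):
    """Mean normalized RLU of ligand-treated wells over DMSO-treated wells.

    Each argument is a list of (reporter_rlu, renilla_rlu) replicate pairs.
    """
    treated_norm = mean(normalize(f, r) for f, r in treated)
    control_norm = mean(normalize(f, r) for f, r in control)
    if control_norm == 0:
        raise ValueError("control activity is zero; fold induction undefined")
    return treated_norm / control_norm


if __name__ == "__main__":
    # Invented readings for illustration only, not data from the experiments.
    ligand_wells = [(2000.0, 10.0), (4000.0, 20.0)]
    dmso_wells = [(20.0, 10.0), (40.0, 20.0)]
    print(fold_induction(ligand_wells, dmso_wells))
```

Because the denominator is the DMSO-treated background, a switch with lower activity in the absence of ligand can show a higher fold induction even when its absolute reporter activity is not the highest.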

[0312] Results: Surprisingly, all three independent clones of the RXR chimera tested (switch 1.3) were better than either parent-based switch, MmRXRα-EF (switch 1.1) and LmUSP-EF (switch 1.2), see FIG. 1. In particular, the chimeric RXR demonstrated increased ligand sensitivity and increased magnitude of induction. Thus, Applicants have made the surprising discovery that a chimeric RXR ligand binding domain may be used in place of a vertebrate RXR or an invertebrate RXR in an EcR-based inducible gene expression modulation system. This novel EcR/chimeric RXR-based gene expression system provides an improved system characterized by both increased ligand sensitivity and increased magnitude of induction.

[0313] The two best RXR chimera clones of switch 1.3 (“Ch#1” and “Ch#2” of FIG. 2) were compared with the parent-based switches 1.1 and 1.2 in a repeated experiment (“Chim-1” and “Chim-2” in FIG. 2, respectively). In this experiment, the chimeric RXR-based switch was again more sensitive to non-steroidal ligand than either parent-based switch (see FIG. 2). However, in this experiment, the chimeric RXR-based switch was better than the vertebrate RXR (MmRXRα-EF)-based switch for magnitude of induction but was similar to the invertebrate RXR (LmUSP-EF)-based switch.

[0314] The same chimeric RXR- and parent RXR-based switches were also examined in a human lung carcinoma cell line A549 (ATCC) and similar results were observed (FIG. 3).

[0315] Thus, Applicants have demonstrated for the first time that a chimeric RXR ligand binding domain can function effectively in partnership with an ecdysone receptor in an inducible gene expression system in mammalian cells. Surprisingly, the EcR/chimeric RXR-based inducible gene expression system of the present invention is an improvement over both the EcR/vertebrate RXR- and EcR/invertebrate RXR-based gene expression modulation systems since less ligand is required for transactivation and increased levels of transactivation can be achieved.

[0316] Based upon Applicants' discovery described herein, one of ordinary skill in the art is able to predict that other chimeric RXR ligand binding domains comprising at least two different species RXR polypeptide fragments from a vertebrate RXR LBD, an invertebrate RXR LBD, or a non-Dipteran and non-Lepidopteran invertebrate RXR homolog will also function in Applicants' EcR/chimeric RXR-based inducible gene expression system. Based upon Applicants' invention, the means to make additional chimeric RXR LBD embodiments within the scope of the present invention is within the art and no undue experimentation is necessary. Indeed, one of skill in the art can routinely clone and sequence a polynucleotide encoding a vertebrate or invertebrate RXR or RXR homolog LBD and, based upon sequence homology analyses similar to that presented in FIG. 4, determine the corresponding polynucleotide and polypeptide fragments of that particular species RXR LBD that are encompassed within the scope of the present invention.

[0317] One of ordinary skill in the art is also able to predict that Applicants' novel inducible gene expression system will also work to modulate gene expression in yeast cells. Since the Dipteran RXR homolog/and Lepidopteran RXR homolog/EcR-based gene expression systems function constitutively in yeast cells (data not shown), similar to how they function in mammalian cells, and non-Dipteran and non-Lepidopteran invertebrate RXRs function inducibly in partnership with an EcR in mammalian cells, the EcR/chimeric RXR-based inducible gene expression modulation system is predicted to function inducibly in yeast cells, similar to how it functions in mammalian cells. Thus, the EcR/chimeric RXR inducible gene expression system of the present invention is useful in applications where modulation of gene expression levels is desired in both yeast and mammalian cells. Furthermore, Applicants' invention is also contemplated to work in other cells, including but not limited to bacterial cells, fungal cells, and animal cells.

Example 3

[0318] There are six amino acids in the C-terminal end of the LBD that are different between MmRXRα and LmUSP (see sequence alignments presented in FIG. 4). To verify if these six amino acids contribute to the differences observed between MmRXRα and LmUSP transactivation abilities, Applicants constructed RXR chimeras in which the C-terminal six amino acids, designated herein as the F domain, of one parent RXR were substituted for the F domain of the other parent RXR. Gene switches comprising LmUSP-EF fused to MmRXRα-F (VP16/LmUSP-EF-MmRXRα-F, switch 1.6), MmRXRα-EF fused to LmUSP-F (VP16/MmRXRαEF-LmUSP-F, switch 1.5), and MmRXRα-EF(1-7)-LmUSP-EF(8-12) fused to MmRXRα-F (Chimera/RXR-F, switch 1.4) were constructed as described in Example 1. These constructs were transfected in NIH3T3 cells and transactivation potential was assayed in the presence of 0, 0.2, 1, and 10 μM N-(2-ethyl-3-methoxybenzoyl)-N′-(3,5-dimethylbenzoyl)-N′-tert-butylhydrazine non-steroidal ligand. The F-domain chimeras (gene switches 1.4-1.6) were compared to the MmRXRα-EF(1-7)-LmUSP-EF(8-12) chimeric RXR LBD of gene switch 1.3. Plasmid pFRLUC (Stratagene) encoding a luciferase polypeptide was used as a reporter gene construct and pTKRL (Promega) encoding a Renilla luciferase polypeptide under the control of the constitutive TK promoter was used to normalize the transfections as described above. The cells were harvested and lysed, and luciferase reporter activity was measured in the cell lysates. Total firefly luciferase relative light units are presented. The number on the top of each bar is the maximum fold induction for that treatment. The analysis was performed in triplicate and mean luciferase counts [total relative light units (RLU)] were determined as described above.

[0319] As shown in FIG. 5, the six amino acids in the C-terminal end of the LBD (F domain) do not appear to account for the differences observed between vertebrate RXR and invertebrate RXR transactivation abilities, suggesting that helices 8-12 of the EF domain are most likely responsible for these differences between vertebrate and invertebrate RXRs.

Example 4

[0320] This Example describes the construction of four EcR-DEF-based gene switches comprising the DEF domains from Choristoneura fumiferana (Lepidoptera), Drosophila melanogaster (Diptera), Tenebrio molitor (Coleoptera), and Amblyomma americanum (Ixodidae) fused to a GAL4 DNA binding domain. In addition, the EF domains of vertebrate RXRs, invertebrate RXRs, or invertebrate USPs from Choristoneura fumiferana USP, Drosophila melanogaster USP, Locusta migratoria USP (Orthoptera), Mus musculus RXRα (Vertebrata), a chimera between MmRXRα and LmUSP (Chimera; of switch 1.13), Amblyomma americanum RXR homolog 1 (Ixodidae), and Amblyomma americanum RXR homolog 2 (Ixodidae) were fused to a VP16 activation domain. The receptor combinations were compared for their ability to transactivate the reporter plasmid pFRLuc in mouse NIH3T3 cells in the presence of 0, 0.2, 1, or 10 μM PonA steroidal ligand (Sigma Chemical Company) or 0, 0.04, 0.2, 1, or 10 μM N-(2-ethyl-3-methoxybenzoyl)-N′-(3,5-dimethylbenzoyl)-N′-tert-butylhydrazine non-steroidal ligand as described above. The cells were harvested and lysed, and luciferase reporter activity was measured in the cell lysates. Total firefly luciferase relative light units are presented. The number on the top of each bar is the maximum fold induction for that treatment. The analysis was performed in triplicate and mean luciferase counts [total relative light units (RLU)] were determined as described above.

[0321] FIGS. 6-8 show the results of these analyses. The MmRXR-LmUSP chimera was the best partner for CfEcR (11,000 fold induction, FIG. 6) and DmEcR (1759 fold induction, FIG. 7). For all other EcRs tested, the RXR chimera produced higher background levels in the absence of ligand (see FIG. 8). The CfEcR/chimeric RXR-based switch (switch 1.13) was more sensitive to non-steroid than PonA, whereas the DmEcR/chimeric RXR-based switch (switch 1.20) was more sensitive to PonA than non-steroid. Since these two switch formats produce decent levels of induction and show differential sensitivity to steroids and non-steroids, these may be exploited for applications in which two or more gene switches are desired.

[0322] Except for CfEcR, all other EcRs tested in partnership with the chimeric RXR are more sensitive to steroids than to non-steroids. The TmEcR/chimeric RXR-based switch (switch 1.21; FIG. 8) is more sensitive to PonA and less sensitive to non-steroid and works best when partnered with either MmRXRα, AmaRXR1, or AmaRXR2. The AmaEcR/chimeric RXR-based switch (switch 1.22; FIG. 8) is also more sensitive to PonA and less sensitive to non-steroid and works best when partnered with either an LmUSP, MmRXR, AmaRXR1 or AmaRXR2-based gene expression cassette. Thus, TmEcR/and AmaEcR/chimeric RXR-based gene switches appear to form a group of ecdysone receptors that is different from the lepidopteran and dipteran EcR/chimeric RXR-based gene switch group (CfEcR/chimeric RXR and DmEcR/chimeric RXR, respectively). As noted above, the differential ligand sensitivities of Applicants' EcR/chimeric RXR-based gene switches are advantageous for use in applications in which two or more gene switches are desired.

Example 5

[0323] This Example describes Applicants' further analysis of gene expression cassettes encoding various chimeric RXR polypeptides comprising a mouse RXRα isoform polypeptide fragment or a human RXRβ isoform polypeptide fragment and an LmUSP polypeptide fragment in mouse NIH3T3 cells. These RXR chimeras were constructed in an effort to identify the helix or helices of the EF domain that account for the observed transactivational differences between vertebrate and invertebrate RXRs. Briefly, five different gene expression cassettes encoding a chimeric RXR ligand binding domain were constructed as described in Example 1. The five chimeric RXR ligand binding domains encoded by these gene expression cassettes and the respective vertebrate RXR and invertebrate RXR fragments they comprise are depicted in Table 1.

TABLE 1
HsRXRβ/LmUSP EF Domain Chimeric RXRs

Chimera Name      HsRXRβ-EF Polypeptide Fragment(s)    LmUSP-EF Polypeptide Fragment(s)
β Chimera #6      Helices 1-6                          Helices 7-12
β Chimera #8      Helices 1-7                          Helices 8-12
β Chimera #9      Helices 1-8                          Helices 9-12
β Chimera #10     Helices 1-9                          Helices 10-12
β Chimera #11     Helices 1-10                         Helices 11-12

[0324] Three individual clones of each chimeric RXR LBD of Table 1 were transfected into mouse NIH3T3 cells along with either GAL4CfEcR-DEF (switches 1.25-1.29 of Example 1; FIGS. 9 and 10) or GAL4DmEcR-DEF (switches 1.30-1.34 of Example 1; FIG. 11) and the reporter plasmid pFRLuc as described above. The transfected cells were cultured in the presence of either a) 0, 0.2, 1, or 10 μM non-steroidal ligand (FIG. 9), or b) 0, 0.2, 1, or 10 μM steroid ligand PonA or 0, 0.04, 0.2, 1, or 10 μM non-steroid ligand (FIGS. 10 and 11) for 48 hours. The reporter gene activity was measured and total RLU are shown. The number on top of each bar is the maximum fold induction for that treatment and is the mean of three replicates.

[0325] As shown in FIG. 9, the best results were obtained when an HsRXRβ H1-8 and LmUSP H9-12 chimeric RXR ligand binding domain (of switch 1.27) was used, indicating that helix 9 of LmUSP may be responsible for sensitivity and magnitude of induction.

[0326] Using CfEcR as a partner, chimera 9 demonstrated maximum induction (see FIG. 10). Chimeras 6 and 8 also produced good induction and lower background; as a result, the fold induction was higher for these two chimeras than for chimera 9. Chimeras 10 and 11 produced lower levels of reporter activity.

[0327] Using DmEcR as a partner, chimera 8 produced the best reporter activity (see FIG. 11). Chimera 9 also performed well, whereas chimeras 6, 10 and 11 demonstrated lower levels of reporter activity.

[0328] The selection of a particular chimeric RXR ligand binding domain can also influence the performance of EcR in response to a particular ligand. Specifically, CfEcR in combination with chimera 11 responded well to non-steroid but not to PonA (see FIG. 10). Conversely, DmEcR in combination with chimera 11 responded well to PonA but not to non-steroid (see FIG. 11).

Example 6

[0329] This Example demonstrates the effect of introduction of a second ligand into the host cell comprising an EcR/chimeric RXR-based inducible gene expression modulation system of the invention. In particular, Applicants have determined the effect of 9-cis-retinoic acid on the transactivation potential of the GAL4CfEcR-DEF/VP16HsRXRβ-(1-8)-LmUSP-(9-12)-EF (βchimera 9; switch 1.27) gene switch along with pFRLuc in NIH 3T3 cells in the presence of non-steroid (GSE) for 48 hours.

[0330] Briefly, GAL4CfEcR-DEF, pFRLuc and VP16HsRXRβ-(1-8)-LmUSP-(9-12)-EF (chimera #9) were transfected into NIH3T3 cells and the transfected cells were treated with 0, 0.04, 0.2, 1, 5 and 25 μM non-steroidal ligand (GSE) and 0, 1, 5 and 25 μM 9-cis-retinoic acid (Sigma Chemical Company). The reporter activity was measured at 48 hours after adding ligands.

[0331] As shown in FIG. 12, the presence of retinoic acid increased the sensitivity of CfEcR-DEF to non-steroidal ligand. At a non-steroid ligand concentration of 0.04 μM, there is very little induction in the absence of 9-cis-retinoic acid, but when 1 μM 9-cis-retinoic acid is added in addition to 0.04 μM non-steroid, induction is greatly increased.
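
The two-ligand treatment scheme of this Example can be laid out as a dose grid. The sketch below assumes, as an illustration only, that every non-steroid dose was combined with every 9-cis-retinoic acid dose; the text lists the two dose series but does not state that the design was fully crossed. The names used are hypothetical.

```python
from itertools import product

# Dose series named in the text, in μM.
NON_STEROID_UM = (0, 0.04, 0.2, 1, 5, 25)
RETINOIC_ACID_UM = (0, 1, 5, 25)


def treatment_grid():
    """All (non-steroid, 9-cis-retinoic acid) dose pairs, assuming a full cross."""
    return list(product(NON_STEROID_UM, RETINOIC_ACID_UM))


if __name__ == "__main__":
    grid = treatment_grid()
    print(len(grid), "treatment conditions")
    for non_steroid, retinoic_acid in grid:
        print(f"non-steroid {non_steroid} μM + 9-cis-RA {retinoic_acid} μM")
```

The comparison highlighted in the text corresponds to the pairs (0.04, 0) and (0.04, 1) in this grid.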

1 75 1 735 DNA Choristoneura fumiferana 1 taccaggacg ggtacgagcagccttctgat gaagatttga agaggattac gcagacgtgg 60 cagcaagcgg acgatgaaaacgaagagtct gacactccct tccgccagat cacagagatg 120 actatcctca cggtccaacttatcgtggag ttcgcgaagg gattgccagg gttcgccaag 180 atctcgcagc ctgatcaaattacgctgctt aaggcttgct caagtgaggt aatgatgctc 240 cgagtcgcgc gacgatacgatgcggcctca gacagtgttc tgttcgcgaa caaccaagcg 300 tacactcgcg acaactaccgcaaggctggc atggcctacg tcatcgagga tctactgcac 360 ttctgccggt gcatgtactctatggcgttg gacaacatcc attacgcgct gctcacggct 420 gtcgtcatct tttctgaccggccagggttg gagcagccgc aactggtgga agaaatccag 480 cggtactacc tgaatacgctccgcatctat atcctgaacc agctgagcgg gtcggcgcgt 540 tcgtccgtca tatacggcaagatcctctca atcctctctg agctacgcac gctcggcatg 600 caaaactcca acatgtgcatctccctcaag ctcaagaaca gaaagctgcc gcctttcctc 660 gaggagatct gggatgtggcggacatgtcg cacacccaac cgccgcctat cctcgagtcc 720 cccacgaatc tctag 735 21338 DNA Drosophila melanogaster 2 tatgagcagc catctgaaga ggatctcaggcgtataatga gtcaacccga tgagaacgag 60 agccaaacgg acgtcagctt tcggcatataaccgagataa ccatactcac ggtccagttg 120 attgttgagt ttgctaaagg tctaccagcgtttacaaaga taccccagga ggaccagatc 180 acgttactaa aggcctgctc gtcggaggtgatgatgctgc gtatggcacg acgctatgac 240 cacagctcgg actcaatatt cttcgcgaataatagatcat atacgcggga ttcttacaaa 300 atggccggaa tggctgataa cattgaagacctgctgcatt tctgccgcca aatgttctcg 360 atgaaggtgg acaacgtcga atacgcgcttctcactgcca ttgtgatctt ctcggaccgg 420 ccgggcctgg agaaggccca actagtcgaagcgatccaga gctactacat cgacacgcta 480 cgcatttata tactcaaccg ccactgcggcgactcaatga gcctcgtctt ctacgcaaag 540 ctgctctcga tcctcaccga gctgcgtacgctgggcaacc agaacgccga gatgtgtttc 600 tcactaaagc tcaaaaaccg caaactgcccaagttcctcg aggagatctg ggacgttcat 660 gccatcccgc catcggtcca gtcgcaccttcagattaccc aggaggagaa cgagcgtctc 720 gagcgggctg agcgtatgcg ggcatcggttgggggcgcca ttaccgccgg cattgattgc 780 gactctgcct ccacttcggc ggcggcagccgcggcccagc atcagcctca gcctcagccc 840 cagccccaac cctcctccct gacccagaacgattcccagc accagacaca gccgcagcta 900 caacctcagc taccacctca 
gctgcaaggtcaactgcaac cccagctcca accacagctt 960 cagacgcaac tccagccaca gattcaaccacagccacagc tccttcccgt ctccgctccc 1020 gtgcccgcct ccgtaaccgc acctggttccttgtccgcgg tcagtacgag cagcgaatac 1080 atgggcggaa gtgcggccat aggacccatcacgccggcaa ccaccagcag tatcacggct 1140 gccgttaccg ctagctccac cacatcagcggtaccgatgg gcaacggagt tggagtcggt 1200 gttggggtgg gcggcaacgt cagcatgtatgcgaacgccc agacggcgat ggccttgatg 1260 ggtgtagccc tgcattcgca ccaagagcagcttatcgggg gagtggcggt taagtcggag 1320 cactcgacga ctgcatag 1338 3 960 DNAChoristoneura fumiferana 3 cctgagtgcg tagtacccga gactcagtgc gccatgaagcggaaagagaa gaaagcacag 60 aaggagaagg acaaactgcc tgtcagcacg acgacggtggacgaccacat gccgcccatt 120 atgcagtgtg aacctccacc tcctgaagca gcaaggattcacgaagtggt cccaaggttt 180 ctctccgaca agctgttgga gacaaaccgg cagaaaaacatcccccagtt gacagccaac 240 cagcagttcc ttatcgccag gctcatctgg taccaggacgggtacgagca gccttctgat 300 gaagatttga agaggattac gcagacgtgg cagcaagcggacgatgaaaa cgaagagtct 360 gacactccct tccgccagat cacagagatg actatcctcacggtccaact tatcgtggag 420 ttcgcgaagg gattgccagg gttcgccaag atctcgcagcctgatcaaat tacgctgctt 480 aaggcttgct caagtgaggt aatgatgctc cgagtcgcgcgacgatacga tgcggcctca 540 gacagtgttc tgttcgcgaa caaccaagcg tacactcgcgacaactaccg caaggctggc 600 atggcctacg tcatcgagga tctactgcac ttctgccggtgcatgtactc tatggcgttg 660 gacaacatcc attacgcgct gctcacggct gtcgtcatcttttctgaccg gccagggttg 720 gagcagccgc aactggtgga agaaatccag cggtactacctgaatacgct ccgcatctat 780 atcctgaacc agctgagcgg gtcggcgcgt tcgtccgtcatatacggcaa gatcctctca 840 atcctctctg agctacgcac gctcggcatg caaaactccaacatgtgcat ctccctcaag 900 ctcaagaaca gaaagctgcc gcctttcctc gaggagatctgggatgtggc ggacatgtcg 960 4 969 DNA Drosophila melanogaster 4 cggccggaatgcgtcgtccc ggagaaccaa tgtgcgatga agcggcgcga aaagaaggcc 60 cagaaggagaaggacaaaat gaccacttcg ccgagctctc agcatggcgg caatggcagc 120 ttggcctctggtggcggcca agactttgtt aagaaggaga ttcttgacct tatgacatgc 180 gagccgccccagcatgccac tattccgcta ctacctgatg aaatattggc caagtgtcaa 240 gcgcgcaatataccttcctt aacgtacaat cagttggccg ttatatacaa 
gttaatttgg 300 taccaggatggctatgagca gccatctgaa gaggatctca ggcgtataat gagtcaaccc 360 gatgagaacgagagccaaac ggacgtcagc tttcggcata taaccgagat aaccatactc 420 acggtccagttgattgttga gtttgctaaa ggtctaccag cgtttacaaa gataccccag 480 gaggaccagatcacgttact aaaggcctgc tcgtcggagg tgatgatgct gcgtatggca 540 cgacgctatgaccacagctc ggactcaata ttcttcgcga ataatagatc atatacgcgg 600 gattcttacaaaatggccgg aatggctgat aacattgaag acctgctgca tttctgccgc 660 caaatgttctcgatgaaggt ggacaacgtc gaatacgcgc ttctcactgc cattgtgatc 720 ttctcggaccggccgggcct ggagaaggcc caactagtcg aagcgatcca gagctactac 780 atcgacacgctacgcattta tatactcaac cgccactgcg gcgactcaat gagcctcgtc 840 ttctacgcaaagctgctctc gatcctcacc gagctgcgta cgctgggcaa ccagaacgcc 900 gagatgtgtttctcactaaa gctcaaaaac cgcaaactgc ccaagttcct cgaggagatc 960 tgggacgtt 9695 244 PRT Choristoneura fumiferana 5 Tyr Gln Asp Gly Tyr Glu Gln Pro SerAsp Glu Asp Leu Lys Arg Ile 1 5 10 15 Thr Gln Thr Trp Gln Gln Ala AspAsp Glu Asn Glu Glu Ser Asp Thr 20 25 30 Pro Phe Arg Gln Ile Thr Glu MetThr Ile Leu Thr Val Gln Leu Ile 35 40 45 Val Glu Phe Ala Lys Gly Leu ProGly Phe Ala Lys Ile Ser Gln Pro 50 55 60 Asp Gln Ile Thr Leu Leu Lys AlaCys Ser Ser Glu Val Met Met Leu 65 70 75 80 Arg Val Ala Arg Arg Tyr AspAla Ala Ser Asp Ser Val Leu Phe Ala 85 90 95 Asn Asn Gln Ala Tyr Thr ArgAsp Asn Tyr Arg Lys Ala Gly Met Ala 100 105 110 Tyr Val Ile Glu Asp LeuLeu His Phe Cys Arg Cys Met Tyr Ser Met 115 120 125 Ala Leu Asp Asn IleHis Tyr Ala Leu Leu Thr Ala Val Val Ile Phe 130 135 140 Ser Asp Arg ProGly Leu Glu Gln Pro Gln Leu Val Glu Glu Ile Gln 145 150 155 160 Arg TyrTyr Leu Asn Thr Leu Arg Ile Tyr Ile Leu Asn Gln Leu Ser 165 170 175 GlySer Ala Arg Ser Ser Val Ile Tyr Gly Lys Ile Leu Ser Ile Leu 180 185 190Ser Glu Leu Arg Thr Leu Gly Met Gln Asn Ser Asn Met Cys Ile Ser 195 200205 Leu Lys Leu Lys Asn Arg Lys Leu Pro Pro Phe Leu Glu Glu Ile Trp 210215 220 Asp Val Ala Asp Met Ser His Thr Gln Pro Pro Pro Ile Leu Glu Ser225 230 235 240 Pro Thr Asn Leu 6 445 PRT Drosophila melanogaster 6 
TyrGlu Gln Pro Ser Glu Glu Asp Leu Arg Arg Ile Met Ser Gln Pro 1 5 10 15Asp Glu Asn Glu Ser Gln Thr Asp Val Ser Phe Arg His Ile Thr Glu 20 25 30Ile Thr Ile Leu Thr Val Gln Leu Ile Val Glu Phe Ala Lys Gly Leu 35 40 45Pro Ala Phe Thr Lys Ile Pro Gln Glu Asp Gln Ile Thr Leu Leu Lys 50 55 60Ala Cys Ser Ser Glu Val Met Met Leu Arg Met Ala Arg Arg Tyr Asp 65 70 7580 His Ser Ser Asp Ser Ile Phe Phe Ala Asn Asn Arg Ser Tyr Thr Arg 85 9095 Asp Ser Tyr Lys Met Ala Gly Met Ala Asp Asn Ile Glu Asp Leu Leu 100105 110 His Phe Cys Arg Gln Met Phe Ser Met Lys Val Asp Asn Val Glu Tyr115 120 125 Ala Leu Leu Thr Ala Ile Val Ile Phe Ser Asp Arg Pro Gly LeuGlu 130 135 140 Lys Ala Gln Leu Val Glu Ala Ile Gln Ser Tyr Tyr Ile AspThr Leu 145 150 155 160 Arg Ile Tyr Ile Leu Asn Arg His Cys Gly Asp SerMet Ser Leu Val 165 170 175 Phe Tyr Ala Lys Leu Leu Ser Ile Leu Thr GluLeu Arg Thr Leu Gly 180 185 190 Asn Gln Asn Ala Glu Met Cys Phe Ser LeuLys Leu Lys Asn Arg Lys 195 200 205 Leu Pro Lys Phe Leu Glu Glu Ile TrpAsp Val His Ala Ile Pro Pro 210 215 220 Ser Val Gln Ser His Leu Gln IleThr Gln Glu Glu Asn Glu Arg Leu 225 230 235 240 Glu Arg Ala Glu Arg MetArg Ala Ser Val Gly Gly Ala Ile Thr Ala 245 250 255 Gly Ile Asp Cys AspSer Ala Ser Thr Ser Ala Ala Ala Ala Ala Ala 260 265 270 Gln His Gln ProGln Pro Gln Pro Gln Pro Gln Pro Ser Ser Leu Thr 275 280 285 Gln Asn AspSer Gln His Gln Thr Gln Pro Gln Leu Gln Pro Gln Leu 290 295 300 Pro ProGln Leu Gln Gly Gln Leu Gln Pro Gln Leu Gln Pro Gln Leu 305 310 315 320Gln Thr Gln Leu Gln Pro Gln Ile Gln Pro Gln Pro Gln Leu Leu Pro 325 330335 Val Ser Ala Pro Val Pro Ala Ser Val Thr Ala Pro Gly Ser Leu Ser 340345 350 Ala Val Ser Thr Ser Ser Glu Tyr Met Gly Gly Ser Ala Ala Ile Gly355 360 365 Pro Ile Thr Pro Ala Thr Thr Ser Ser Ile Thr Ala Ala Val ThrAla 370 375 380 Ser Ser Thr Thr Ser Ala Val Pro Met Gly Asn Gly Val GlyVal Gly 385 390 395 400 Val Gly Val Gly Gly Asn Val Ser Met Tyr Ala AsnAla Gln Thr Ala 405 410 415 Met Ala Leu Met Gly Val Ala Leu His Ser HisGln 
Glu Gln Leu Ile 420 425 430 Gly Gly Val Ala Val Lys Ser Glu His SerThr Thr Ala 435 440 445 7 320 PRT Choristoneura fumiferana 7 Pro Glu CysVal Val Pro Glu Thr Gln Cys Ala Met Lys Arg Lys Glu 1 5 10 15 Lys LysAla Gln Lys Glu Lys Asp Lys Leu Pro Val Ser Thr Thr Thr 20 25 30 Val AspAsp His Met Pro Pro Ile Met Gln Cys Glu Pro Pro Pro Pro 35 40 45 Glu AlaAla Arg Ile His Glu Val Val Pro Arg Phe Leu Ser Asp Lys 50 55 60 Leu LeuGlu Thr Asn Arg Gln Lys Asn Ile Pro Gln Leu Thr Ala Asn 65 70 75 80 GlnGln Phe Leu Ile Ala Arg Leu Ile Trp Tyr Gln Asp Gly Tyr Glu 85 90 95 GlnPro Ser Asp Glu Asp Leu Lys Arg Ile Thr Gln Thr Trp Gln Gln 100 105 110Ala Asp Asp Glu Asn Glu Glu Ser Asp Thr Pro Phe Arg Gln Ile Thr 115 120125 Glu Met Thr Ile Leu Thr Val Gln Leu Ile Val Glu Phe Ala Lys Gly 130135 140 Leu Pro Gly Phe Ala Lys Ile Ser Gln Pro Asp Gln Ile Thr Leu Leu145 150 155 160 Lys Ala Cys Ser Ser Glu Val Met Met Leu Arg Val Ala ArgArg Tyr 165 170 175 Asp Ala Ala Ser Asp Ser Val Leu Phe Ala Asn Asn GlnAla Tyr Thr 180 185 190 Arg Asp Asn Tyr Arg Lys Ala Gly Met Ala Tyr ValIle Glu Asp Leu 195 200 205 Leu His Phe Cys Arg Cys Met Tyr Ser Met AlaLeu Asp Asn Ile His 210 215 220 Tyr Ala Leu Leu Thr Ala Val Val Ile PheSer Asp Arg Pro Gly Leu 225 230 235 240 Glu Gln Pro Gln Leu Val Glu GluIle Gln Arg Tyr Tyr Leu Asn Thr 245 250 255 Leu Arg Ile Tyr Ile Leu AsnGln Leu Ser Gly Ser Ala Arg Ser Ser 260 265 270 Val Ile Tyr Gly Lys IleLeu Ser Ile Leu Ser Glu Leu Arg Thr Leu 275 280 285 Gly Met Gln Asn SerAsn Met Cys Ile Ser Leu Lys Leu Lys Asn Arg 290 295 300 Lys Leu Pro ProPhe Leu Glu Glu Ile Trp Asp Val Ala Asp Met Ser 305 310 315 320 8 323PRT Drosophila melanogaster 8 Arg Pro Glu Cys Val Val Pro Glu Asn GlnCys Ala Met Lys Arg Arg 1 5 10 15 Glu Lys Lys Ala Gln Lys Glu Lys AspLys Met Thr Thr Ser Pro Ser 20 25 30 Ser Gln His Gly Gly Asn Gly Ser LeuAla Ser Gly Gly Gly Gln Asp 35 40 45 Phe Val Lys Lys Glu Ile Leu Asp LeuMet Thr Cys Glu Pro Pro Gln 50 55 60 His Ala Thr Ile Pro Leu Leu Pro AspGlu Ile Leu Ala 
Lys Cys Gln 65 70 75 80 Ala Arg Asn Ile Pro Ser Leu ThrTyr Asn Gln Leu Ala Val Ile Tyr 85 90 95 Lys Leu Ile Trp Tyr Gln Asp GlyTyr Glu Gln Pro Ser Glu Glu Asp 100 105 110 Leu Arg Arg Ile Met Ser GlnPro Asp Glu Asn Glu Ser Gln Thr Asp 115 120 125 Val Ser Phe Arg His IleThr Glu Ile Thr Ile Leu Thr Val Gln Leu 130 135 140 Ile Val Glu Phe AlaLys Gly Leu Pro Ala Phe Thr Lys Ile Pro Gln 145 150 155 160 Glu Asp GlnIle Thr Leu Leu Lys Ala Cys Ser Ser Glu Val Met Met 165 170 175 Leu ArgMet Ala Arg Arg Tyr Asp His Ser Ser Asp Ser Ile Phe Phe 180 185 190 AlaAsn Asn Arg Ser Tyr Thr Arg Asp Ser Tyr Lys Met Ala Gly Met 195 200 205Ala Asp Asn Ile Glu Asp Leu Leu His Phe Cys Arg Gln Met Phe Ser 210 215220 Met Lys Val Asp Asn Val Glu Tyr Ala Leu Leu Thr Ala Ile Val Ile 225230 235 240 Phe Ser Asp Arg Pro Gly Leu Glu Lys Ala Gln Leu Val Glu AlaIle 245 250 255 Gln Ser Tyr Tyr Ile Asp Thr Leu Arg Ile Tyr Ile Leu AsnArg His 260 265 270 Cys Gly Asp Ser Met Ser Leu Val Phe Tyr Ala Lys LeuLeu Ser Ile 275 280 285 Leu Thr Glu Leu Arg Thr Leu Gly Asn Gln Asn AlaGlu Met Cys Phe 290 295 300 Ser Leu Lys Leu Lys Asn Arg Lys Leu Pro LysPhe Leu Glu Glu Ile 305 310 315 320 Trp Asp Val 9 714 DNA Mus musculus 9gccaacgagg acatgcctgt agagaagatt ctggaagccg agcttgctgt cgagcccaag 60actgagacat acgtggaggc aaacatgggg ctgaacccca gctcaccaaa tgaccctgtt 120accaacatct gtcaagcagc agacaagcag ctcttcactc ttgtggagtg ggccaagagg 180atcccacact tttctgagct gcccctagac gaccaggtca tcctgctacg ggcaggctgg 240aacgagctgc tgatcgcctc cttctcccac cgctccatag ctgtgaaaga tgggattctc 300ctggccaccg gcctgcacgt acaccggaac agcgctcaca gtgctggggt gggcgccatc 360tttgacaggg tgctaacaga gctggtgtct aagatgcgtg acatgcagat ggacaagacg 420gagctgggct gcctgcgagc cattgtcctg ttcaaccctg actctaaggg gctctcaaac 480cctgctgagg tggaggcgtt gagggagaag gtgtatgcgt cactagaagc gtactgcaaa 540cacaagtacc ctgagcagcc gggcaggttt gccaagctgc tgctccgcct gcctgcactg 600cgttccatcg ggctcaagtg cctggagcac ctgttcttct tcaagctcat cggggacacg 660cccatcgaca ccttcctcat ggagatgctg gaggcaccac 
atcaagccac ctag 714 10 720DNA Mus musculus 10 gcccctgagg agatgcctgt ggacaggatc ctggaggcagagcttgctgt ggagcagaag 60 agtgaccaag gcgttgaggg tcctggggcc accgggggtggtggcagcag cccaaatgac 120 ccagtgacta acatctgcca ggcagctgac aaacagctgttcacactcgt tgagtgggca 180 aagaggatcc cgcacttctc ctccctacct ctggacgatcaggtcatact gctgcgggca 240 ggctggaacg agctcctcat tgcgtccttc tcccatcggtccattgatgt ccgagatggc 300 atcctcctgg ccacgggtct tcatgtgcac agaaactcagcccattccgc aggcgtggga 360 gccatctttg atcgggtgct gacagagcta gtgtccaaaatgcgtgacat gaggatggac 420 aagacagagc ttggctgcct gcgggcaatc atcatgtttaatccagacgc caagggcctc 480 tccaaccctg gagaggtgga gatccttcgg gagaaggtgtacgcctcact ggagacctat 540 tgcaagcaga agtaccctga gcagcagggc cggtttgccaagctgctgtt acgtcttcct 600 gccctccgct ccatcggcct caagtgtctg gagcacctgttcttcttcaa gctcattggc 660 gacaccccca ttgacacctt cctcatggag atgcttgaggctccccacca gctagcctga 720 11 705 DNA Mus musculus 11 agccacgaagacatgcccgt ggagaggatt ctagaagccg aacttgctgt ggaaccaaag 60 acagaatcctacggtgacat gaacgtggag aactcaacaa atgaccctgt taccaacata 120 tgccatgctgcagataagca acttttcacc ctcgttgagt gggccaaacg catcccccac 180 ttctcagatctcaccttgga ggaccaggtc attctactcc gggcagggtg gaatgaactg 240 ctcattgcctccttctccca ccgctcggtt tccgtccagg atggcatcct gctggccacg 300 ggcctccacgtgcacaggag cagcgctcac agccggggag tcggctccat cttcgacaga 360 gtccttacagagttggtgtc caagatgaaa gacatgcaga tggataagtc agagctgggg 420 tgcctacgggccatcgtgct gtttaaccca gatgccaagg gtttatccaa cccctctgag 480 gtggagactcttcgagagaa ggtttatgcc accctggagg cctataccaa gcagaagtat 540 ccggaacagccaggcaggtt tgccaagctt ctgctgcgtc tccctgctct gcgctccatc 600 ggcttgaaatgcctggaaca cctcttcttc ttcaagctca ttggagacac tcccatcgac 660 agcttcctcatggagatgtt ggagacccca ctgcagatca cctga 705 12 850 DNA Homo sapiens 12gccaacgagg acatgccggt ggagaggatc ctggaggctg agctggccgt ggagcccaag 60accgagacct acgtggaggc aaacatgggg ctgaacccca gctcgccgaa cgaccctgtc 120accaacattt gccaagcagc cgacaaacag cttttcaccc tggtggagtg ggccaagcgg 180atcccacact tctcagagct gcccctggac gaccaggtca tcctgctgcg 
ggcaggctgg 240aatgagctgc tcatcgcctc cttctcccac cgctccatcg ccgtgaagga cgggatcctc 300ctggccaccg ggctgcacgt ccaccggaac agcgcccaca gcgcaggggt gggcgccatc 360tttgacaggg tgctgacgga gcttgtgtcc aagatgcggg acatgcagat ggacaagacg 420gagctgggct gcctgcgcgc catcgtcctc tttaaccctg actccaaggg gctctcgaac 480ccggccgagg tggaggcgct gagggagaag gtctatgcgt ccttggaggc ctactgcaag 540cacaagtacc cagagcagcc gggaaggttc gctaagctct tgctccgcct gccggctctg 600cgctccatcg ggctcaaatg cctggaacat ctcttcttct tcaagctcat cggggacaca 660cccattgaca ccttccttat ggagatgctg gaggcgccgc accaaatgac ttaggcctgc 720gggcccatcc tttgtgccca cccgttctgg ccaccctgcc tggacgccag ctgttcttct 780cagcctgagc cctgtccctg cccttctctg cctggcctgt ttggactttg gggcacagcc 840tgtcactgct 850 13 720 DNA Homo sapiens 13 gcccccgagg agatgcctgtggacaggatc ctggaggcag agcttgctgt ggaacagaag 60 agtgaccagg gcgttgagggtcctggggga accgggggta gcggcagcag cccaaatgac 120 cctgtgacta acatctgtcaggcagctgac aaacagctat tcacgcttgt tgagtgggcg 180 aagaggatcc cacacttttcctccttgcct ctggatgatc aggtcatatt gctgcgggca 240 ggctggaatg aactcctcattgcctccttt tcacaccgat ccattgatgt tcgagatggc 300 atcctccttg ccacaggtcttcacgtgcac cgcaactcag cccattcagc aggagtagga 360 gccatctttg atcgggtgctgacagagcta gtgtccaaaa tgcgtgacat gaggatggac 420 aagacagagc ttggctgcctgagggcaatc attctgttta atccagatgc caagggcctc 480 tccaacccta gtgaggtggaggtcctgcgg gagaaagtgt atgcatcact ggagacctac 540 tgcaaacaga agtaccctgagcagcaggga cggtttgcca agctgctgct acgtcttcct 600 gccctccggt ccattggccttaagtgtcta gagcatctgt ttttcttcaa gctcattggt 660 gacaccccca tcgacaccttcctcatggag atgcttgagg ctccccatca actggcctga 720 14 705 DNA Homo sapiens14 ggtcatgaag acatgcctgt ggagaggatt ctagaagctg aacttgctgt tgaaccaaag 60acagaatcct atggtgacat gaatatggag aactcgacaa atgaccctgt taccaacata 120tgtcatgctg ctgacaagca gcttttcacc ctcgttgaat gggccaagcg tattccccac 180ttctctgacc tcaccttgga ggaccaggtc attttgcttc gggcagggtg gaatgaattg 240ctgattgcct ctttctccca ccgctcagtt tccgtgcagg atggcatcct tctggccacg 300ggtttacatg tccaccggag cagtgcccac agtgctgggg tcggctccat 
ctttgacaga 360gttctaactg agctggtttc caaaatgaaa gacatgcaga tggacaagtc ggaactggga 420tgcctgcgag ccattgtact ctttaaccca gatgccaagg gcctgtccaa cccctctgag 480gtggagactc tgcgagagaa ggtttatgcc acccttgagg cctacaccaa gcagaagtat 540ccggaacagc caggcaggtt tgccaagctg ctgctgcgcc tcccagctct gcgttccatt 600ggcttgaaat gcctggagca cctcttcttc ttcaagctca tcggggacac ccccattgac 660accttcctca tggagatgtt ggagaccccg ctgcagatca cctga 705 15 237 PRT Musmusculus 15 Ala Asn Glu Asp Met Pro Val Glu Lys Ile Leu Glu Ala Glu LeuAla 1 5 10 15 Val Glu Pro Lys Thr Glu Thr Tyr Val Glu Ala Asn Met GlyLeu Asn 20 25 30 Pro Ser Ser Pro Asn Asp Pro Val Thr Asn Ile Cys Gln AlaAla Asp 35 40 45 Lys Gln Leu Phe Thr Leu Val Glu Trp Ala Lys Arg Ile ProHis Phe 50 55 60 Ser Glu Leu Pro Leu Asp Asp Gln Val Ile Leu Leu Arg AlaGly Trp 65 70 75 80 Asn Glu Leu Leu Ile Ala Ser Phe Ser His Arg Ser IleAla Val Lys 85 90 95 Asp Gly Ile Leu Leu Ala Thr Gly Leu His Val His ArgAsn Ser Ala 100 105 110 His Ser Ala Gly Val Gly Ala Ile Phe Asp Arg ValLeu Thr Glu Leu 115 120 125 Val Ser Lys Met Arg Asp Met Gln Met Asp LysThr Glu Leu Gly Cys 130 135 140 Leu Arg Ala Ile Val Leu Phe Asn Pro AspSer Lys Gly Leu Ser Asn 145 150 155 160 Pro Ala Glu Val Glu Ala Leu ArgGlu Lys Val Tyr Ala Ser Leu Glu 165 170 175 Ala Tyr Cys Lys His Lys TyrPro Glu Gln Pro Gly Arg Phe Ala Lys 180 185 190 Leu Leu Leu Arg Leu ProAla Leu Arg Ser Ile Gly Leu Lys Cys Leu 195 200 205 Glu His Leu Phe PhePhe Lys Leu Ile Gly Asp Thr Pro Ile Asp Thr 210 215 220 Phe Leu Met GluMet Leu Glu Ala Pro His Gln Ala Thr 225 230 235 16 239 PRT Mus musculus16 Ala Pro Glu Glu Met Pro Val Asp Arg Ile Leu Glu Ala Glu Leu Ala 1 510 15 Val Glu Gln Lys Ser Asp Gln Gly Val Glu Gly Pro Gly Ala Thr Gly 2025 30 Gly Gly Gly Ser Ser Pro Asn Asp Pro Val Thr Asn Ile Cys Gln Ala 3540 45 Ala Asp Lys Gln Leu Phe Thr Leu Val Glu Trp Ala Lys Arg Ile Pro 5055 60 His Phe Ser Ser Leu Pro Leu Asp Asp Gln Val Ile Leu Leu Arg Ala 6570 75 80 Gly Trp Asn Glu Leu Leu Ile Ala Ser Phe Ser His Arg Ser Ile Asp85 
90 95 Val Arg Asp Gly Ile Leu Leu Ala Thr Gly Leu His Val His Arg Asn100 105 110 Ser Ala His Ser Ala Gly Val Gly Ala Ile Phe Asp Arg Val LeuThr 115 120 125 Glu Leu Val Ser Lys Met Arg Asp Met Arg Met Asp Lys ThrGlu Leu 130 135 140 Gly Cys Leu Arg Ala Ile Ile Met Phe Asn Pro Asp AlaLys Gly Leu 145 150 155 160 Ser Asn Pro Gly Glu Val Glu Ile Leu Arg GluLys Val Tyr Ala Ser 165 170 175 Leu Glu Thr Tyr Cys Lys Gln Lys Tyr ProGlu Gln Gln Gly Arg Phe 180 185 190 Ala Lys Leu Leu Leu Arg Leu Pro AlaLeu Arg Ser Ile Gly Leu Lys 195 200 205 Cys Leu Glu His Leu Phe Phe PheLys Leu Ile Gly Asp Thr Pro Ile 210 215 220 Asp Thr Phe Leu Met Glu MetLeu Glu Ala Pro His Gln Leu Ala 225 230 235 17 234 PRT Mus musculus 17Ser His Glu Asp Met Pro Val Glu Arg Ile Leu Glu Ala Glu Leu Ala 1 5 1015 Val Glu Pro Lys Thr Glu Ser Tyr Gly Asp Met Asn Val Glu Asn Ser 20 2530 Thr Asn Asp Pro Val Thr Asn Ile Cys His Ala Ala Asp Lys Gln Leu 35 4045 Phe Thr Leu Val Glu Trp Ala Lys Arg Ile Pro His Phe Ser Asp Leu 50 5560 Thr Leu Glu Asp Gln Val Ile Leu Leu Arg Ala Gly Trp Asn Glu Leu 65 7075 80 Leu Ile Ala Ser Phe Ser His Arg Ser Val Ser Val Gln Asp Gly Ile 8590 95 Leu Leu Ala Thr Gly Leu His Val His Arg Ser Ser Ala His Ser Arg100 105 110 Gly Val Gly Ser Ile Phe Asp Arg Val Leu Thr Glu Leu Val SerLys 115 120 125 Met Lys Asp Met Gln Met Asp Lys Ser Glu Leu Gly Cys LeuArg Ala 130 135 140 Ile Val Leu Phe Asn Pro Asp Ala Lys Gly Leu Ser AsnPro Ser Glu 145 150 155 160 Val Glu Thr Leu Arg Glu Lys Val Tyr Ala ThrLeu Glu Ala Tyr Thr 165 170 175 Lys Gln Lys Tyr Pro Glu Gln Pro Gly ArgPhe Ala Lys Leu Leu Leu 180 185 190 Arg Leu Pro Ala Leu Arg Ser Ile GlyLeu Lys Cys Leu Glu His Leu 195 200 205 Phe Phe Phe Lys Leu Ile Gly AspThr Pro Ile Asp Ser Phe Leu Met 210 215 220 Glu Met Leu Glu Thr Pro LeuGln Ile Thr 225 230 18 237 PRT Homo sapiens 18 Ala Asn Glu Asp Met ProVal Glu Arg Ile Leu Glu Ala Glu Leu Ala 1 5 10 15 Val Glu Pro Lys ThrGlu Thr Tyr Val Glu Ala Asn Met Gly Leu Asn 20 25 30 Pro Ser Ser Pro AsnAsp Pro Val 
Thr Asn Ile Cys Gln Ala Ala Asp 35 40 45 Lys Gln Leu Phe ThrLeu Val Glu Trp Ala Lys Arg Ile Pro His Phe 50 55 60 Ser Glu Leu Pro LeuAsp Asp Gln Val Ile Leu Leu Arg Ala Gly Trp 65 70 75 80 Asn Glu Leu LeuIle Ala Ser Phe Ser His Arg Ser Ile Ala Val Lys 85 90 95 Asp Gly Ile LeuLeu Ala Thr Gly Leu His Val His Arg Asn Ser Ala 100 105 110 His Ser AlaGly Val Gly Ala Ile Phe Asp Arg Val Leu Thr Glu Leu 115 120 125 Val SerLys Met Arg Asp Met Gln Met Asp Lys Thr Glu Leu Gly Cys 130 135 140 LeuArg Ala Ile Val Leu Phe Asn Pro Asp Ser Lys Gly Leu Ser Asn 145 150 155160 Pro Ala Glu Val Glu Ala Leu Arg Glu Lys Val Tyr Ala Ser Leu Glu 165170 175 Ala Tyr Cys Lys His Lys Tyr Pro Glu Gln Pro Gly Arg Phe Ala Lys180 185 190 Leu Leu Leu Arg Leu Pro Ala Leu Arg Ser Ile Gly Leu Lys CysLeu 195 200 205 Glu His Leu Phe Phe Phe Lys Leu Ile Gly Asp Thr Pro IleAsp Thr 210 215 220 Phe Leu Met Glu Met Leu Glu Ala Pro His Gln Met Thr225 230 235 19 239 PRT Homo sapiens 19 Ala Pro Glu Glu Met Pro Val AspArg Ile Leu Glu Ala Glu Leu Ala 1 5 10 15 Val Glu Gln Lys Ser Asp GlnGly Val Glu Gly Pro Gly Gly Thr Gly 20 25 30 Gly Ser Gly Ser Ser Pro AsnAsp Pro Val Thr Asn Ile Cys Gln Ala 35 40 45 Ala Asp Lys Gln Leu Phe ThrLeu Val Glu Trp Ala Lys Arg Ile Pro 50 55 60 His Phe Ser Ser Leu Pro LeuAsp Asp Gln Val Ile Leu Leu Arg Ala 65 70 75 80 Gly Trp Asn Glu Leu LeuIle Ala Ser Phe Ser His Arg Ser Ile Asp 85 90 95 Val Arg Asp Gly Ile LeuLeu Ala Thr Gly Leu His Val His Arg Asn 100 105 110 Ser Ala His Ser AlaGly Val Gly Ala Ile Phe Asp Arg Val Leu Thr 115 120 125 Glu Leu Val SerLys Met Arg Asp Met Arg Met Asp Lys Thr Glu Leu 130 135 140 Gly Cys LeuArg Ala Ile Ile Leu Phe Asn Pro Asp Ala Lys Gly Leu 145 150 155 160 SerAsn Pro Ser Glu Val Glu Val Leu Arg Glu Lys Val Tyr Ala Ser 165 170 175Leu Glu Thr Tyr Cys Lys Gln Lys Tyr Pro Glu Gln Gln Gly Arg Phe 180 185190 Ala Lys Leu Leu Leu Arg Leu Pro Ala Leu Arg Ser Ile Gly Leu Lys 195200 205 Cys Leu Glu His Leu Phe Phe Phe Lys Leu Ile Gly Asp Thr Pro Ile210 215 220 Asp Thr 
Phe Leu Met Glu Met Leu Glu Ala Pro His Gln Leu Ala 225 230 235
20 234 PRT Homo sapiens 20 Gly His Glu Asp Met Pro Val Glu Arg Ile Leu Glu Ala Glu Leu Ala 1 5 10 15 Val Glu Pro Lys Thr Glu Ser Tyr Gly Asp Met Asn Met Glu Asn Ser 20 25 30 Thr Asn Asp Pro Val Thr Asn Ile Cys His Ala Ala Asp Lys Gln Leu 35 40 45 Phe Thr Leu Val Glu Trp Ala Lys Arg Ile Pro His Phe Ser Asp Leu 50 55 60 Thr Leu Glu Asp Gln Val Ile Leu Leu Arg Ala Gly Trp Asn Glu Leu 65 70 75 80 Leu Ile Ala Ser Phe Ser His Arg Ser Val Ser Val Gln Asp Gly Ile 85 90 95 Leu Leu Ala Thr Gly Leu His Val His Arg Ser Ser Ala His Ser Ala 100 105 110 Gly Val Gly Ser Ile Phe Asp Arg Val Leu Thr Glu Leu Val Ser Lys 115 120 125 Met Lys Asp Met Gln Met Asp Lys Ser Glu Leu Gly Cys Leu Arg Ala 130 135 140 Ile Val Leu Phe Asn Pro Asp Ala Lys Gly Leu Ser Asn Pro Ser Glu 145 150 155 160 Val Glu Thr Leu Arg Glu Lys Val Tyr Ala Thr Leu Glu Ala Tyr Thr 165 170 175 Lys Gln Lys Tyr Pro Glu Gln Pro Gly Arg Phe Ala Lys Leu Leu Leu 180 185 190 Arg Leu Pro Ala Leu Arg Ser Ile Gly Leu Lys Cys Leu Glu His Leu 195 200 205 Phe Phe Phe Lys Leu Ile Gly Asp Thr Pro Ile Asp Thr Phe Leu Met 210 215 220 Glu Met Leu Glu Thr Pro Leu Gln Ile Thr 225 230
21 635 DNA Locusta migratoria 21 tgcatacaga catgcctgtt gaacgcatac ttgaagctga aaaacgagtg gagtgcaaag 60 cagaaaacca agtggaatat gagctggtgg agtgggctaa acacatcccg cacttcacat 120 ccctacctct ggaggaccag gttctcctcc tcagagcagg ttggaatgaa ctgctaattg 180 cagcattttc acatcgatct gtagatgtta aagatggcat agtacttgcc actggtctca 240 cagtgcatcg aaattctgcc catcaagctg gagtcggcac aatatttgac agagttttga 300 cagaactggt agcaaagatg agagaaatga aaatggataa aactgaactt ggctgcttgc 360 gatctgttat tcttttcaat ccagaggtga ggggtttgaa atccgcccag gaagttgaac 420 ttctacgtga aaaagtatat gccgctttgg aagaatatac tagaacaaca catcccgatg 480 aaccaggaag atttgcaaaa cttttgcttc gtctgccttc tttacgttcc ataggcctta 540 agtgtttgga gcatttgttt ttctttcgcc ttattggaga tgttccaatt gatacgttcc 600 tgatggagat gcttgaatca ccttctgatt cataa 635
22 687 DNA Amblyomma americanum 22 cctcctgaga tgcctctgga
gcgcatactg gaggcagagc tgcgggttga gtcacagacg 60 gggaccctct cggaaagcgc acagcagcag gatccagtga gcagcatctg ccaagctgca 120 gaccgacagc tgcaccagct agttcaatgg gccaagcaca ttccacattt tgaagagctt 180 ccccttgagg accgcatggt gttgctcaag gctggctgga acgagctgct cattgctgct 240 ttctcccacc gttctgttga cgtgcgtgat ggcattgtgc tcgctacagg tcttgtggtg 300 cagcggcata gtgctcatgg ggctggcgtt ggggccatat ttgatagggt tctcactgaa 360 ctggtagcaa agatgcgtga gatgaagatg gaccgcactg agcttggatg cctgcttgct 420 gtggtacttt ttaatcctga ggccaagggg ctgcggacct gcccaagtgg aggccctgag 480 ggagaaagtg tatctgcctt ggaagagcac tgccggcagc agtacccaga ccagcctggg 540 cgctttgcca agctgctgct gcggttgcca gctctgcgca gtattggcct caagtgcctc 600 gaacatctct ttttcttcaa gctcatcggg gacacgccca tcgacaactt tcttctttcc 660 atgctggagg ccccctctga cccctaa 687
23 693 DNA Amblyomma americanum 23 tctccggaca tgccactcga acgcattctc gaagccgaga tgcgcgtcga gcagccggca 60 ccgtccgttt tggcgcagac ggccgcatcg ggccgcgacc ccgtcaacag catgtgccag 120 gctgccccgc cacttcacga gctcgtacag tgggcccggc gaattccgca cttcgaagag 180 cttcccatcg aggatcgcac cgcgctgctc aaagccggct ggaacgaact gcttattgcc 240 gccttttcgc accgttctgt ggcggtgcgc gacggcatcg ttctggccac cgggctggtg 300 gtgcagcggc acagcgcaca cggcgcaggc gttggcgaca tcttcgaccg cgtactagcc 360 gagctggtgg ccaagatgcg cgacatgaag atggacaaaa cggagctcgg ctgcctgcgc 420 gccgtggtgc tcttcaatcc agacgccaag ggtctccgaa acgccaccag agtagaggcg 480 ctccgcgaga aggtgtatgc ggcgctggag gagcactgcc gtcggcacca cccggaccaa 540 ccgggtcgct tcggcaagct gctgctgcgg ctgcctgcct tgcgcagcat cgggctcaaa 600 tgcctcgagc atctgttctt cttcaagctc atcggagaca ctcccataga cagcttcctg 660 ctcaacatgc tggaggcacc ggcagacccc tag 693
24 801 DNA Celuca pugilator 24 tcagacatgc caattgccag catacgggag gcagagctca gcgtggatcc catagatgag 60 cagccgctgg accaaggggt gaggcttcag gttccactcg cacctcctga tagtgaaaag 120 tgtagcttta ctttaccttt tcatcccgtc agtgaagtat cctgtgctaa ccctctgcag 180 gatgtggtga gcaacatatg ccaggcagct gacagacatc tggtgcagct ggtggagtgg 240 gccaagcaca tcccacactt cacagacctt cccatagagg accaagtggt attactcaaa 300 gccgggtgga acgagttgct
tattgcctcattctcacacc gtagcatggg cgtggaggat 360 ggcatcgtgc tggccacagg gctcgtgatccacagaagta gtgctcacca ggctggagtg 420 ggtgccatat ttgatcgtgt cctctctgagctggtggcca agatgaagga gatgaagatt 480 gacaagacag agctgggctg ccttcgctccatcgtcctgt tcaacccaga tgccaaagga 540 ctaaactgcg tcaatgatgt ggagatcttgcgtgagaagg tgtatgctgc cctggaggag 600 tacacacgaa ccacttaccc tgatgaacctggacgctttg ccaagttgct tctgcgactt 660 cctgcactca ggtctatagg cctgaagtgtcttgagtacc tcttcctgtt taagctgatt 720 ggagacactc ccctggacag ctacttgatgaagatgctcg tagacaaccc aaatacaagc 780 gtcactcccc ccaccagcta g 801 25 690DNA Tenebrio molitor 25 gccgagatgc ccctcgacag gataatcgag gcggagaaacggatagaatg cacacccgct 60 ggtggctctg gtggtgtcgg agagcaacac gacggggtgaacaacatctg tcaagccact 120 aacaagcagc tgttccaact ggtgcaatgg gctaagctcatacctcactt tacctcgttg 180 ccgatgtcgg accaggtgct tttattgagg gcaggatggaatgaattgct catcgccgca 240 ttctcgcaca gatctataca ggcgcaggat gccatcgttctagccacggg gttgacagtt 300 aacaaaacgt cggcgcacgc cgtgggcgtg ggcaacatctacgaccgcgt cctctccgag 360 ctggtgaaca agatgaaaga gatgaagatg gacaagacggagctgggctg cttgagagcc 420 atcatcctct acaaccccac gtgtcgcggc atcaagtccgtgcaggaagt ggagatgctg 480 cgtgagaaaa tttacggcgt gctggaagag tacaccaggaccacccaccc gaacgagccc 540 ggcaggttcg ccaaactgct tctgcgcctc ccggccctcaggtccatcgg gttgaaatgt 600 tccgaacacc tctttttctt caagctgatc ggtgatgttccaatagacac gttcctgatg 660 gagatgctgg agtctccggc ggacgcttag 690 26 681DNA Apis mellifera 26 cattcggaca tgccgatcga gcgtatcctg gaggccgagaagagagtcga atgtaagatg 60 gagcaacagg gaaattacga gaatgcagtg tcgcacatttgcaacgccac gaacaaacag 120 ctgttccagc tggtagcatg ggcgaaacac atcccgcattttacctcgtt gccactggag 180 gatcaggtac ttctgctcag ggccggttgg aacgagttgctgatagcctc cttttcccac 240 cgttccatcg acgtgaagga cggtatcgtg ctggcgacggggatcaccgt gcatcggaac 300 tcggcgcagc aggccggcgt gggcacgata ttcgaccgtgtcctctcgga gcttgtctcg 360 aaaatgcgtg aaatgaagat ggacaggaca gagcttggctgtctcagatc tataatactc 420 ttcaatcccg aggttcgagg actgaaatcc atccaggaagtgaccctgct ccgtgagaag 480 atctacggcg ccctggaggg ttattgccgc 
gtagcttggcccgacgacgc tggaagattc 540 gcgaaattac ttctacgcct gcccgccatc cgctcgatcggattaaagtg cctcgagtac 600 ctgttcttct tcaaaatgat cggtgacgta ccgatcgacgattttctcgt ggagatgtta 660 gaatcgcgat cagatcctta g 681 27 210 PRT Locustamigratoria 27 His Thr Asp Met Pro Val Glu Arg Ile Leu Glu Ala Glu LysArg Val 1 5 10 15 Glu Cys Lys Ala Glu Asn Gln Val Glu Tyr Glu Leu ValGlu Trp Ala 20 25 30 Lys His Ile Pro His Phe Thr Ser Leu Pro Leu Glu AspGln Val Leu 35 40 45 Leu Leu Arg Ala Gly Trp Asn Glu Leu Leu Ile Ala AlaPhe Ser His 50 55 60 Arg Ser Val Asp Val Lys Asp Gly Ile Val Leu Ala ThrGly Leu Thr 65 70 75 80 Val His Arg Asn Ser Ala His Gln Ala Gly Val GlyThr Ile Phe Asp 85 90 95 Arg Val Leu Thr Glu Leu Val Ala Lys Met Arg GluMet Lys Met Asp 100 105 110 Lys Thr Glu Leu Gly Cys Leu Arg Ser Val IleLeu Phe Asn Pro Glu 115 120 125 Val Arg Gly Leu Lys Ser Ala Gln Glu ValGlu Leu Leu Arg Glu Lys 130 135 140 Val Tyr Ala Ala Leu Glu Glu Tyr ThrArg Thr Thr His Pro Asp Glu 145 150 155 160 Pro Gly Arg Phe Ala Lys LeuLeu Leu Arg Leu Pro Ser Leu Arg Ser 165 170 175 Ile Gly Leu Lys Cys LeuGlu His Leu Phe Phe Phe Arg Leu Ile Gly 180 185 190 Asp Val Pro Ile AspThr Phe Leu Met Glu Met Leu Glu Ser Pro Ser 195 200 205 Asp Ser 210 28228 PRT Amblyomma americanum 28 Pro Pro Glu Met Pro Leu Glu Arg Ile LeuGlu Ala Glu Leu Arg Val 1 5 10 15 Glu Ser Gln Thr Gly Thr Leu Ser GluSer Ala Gln Gln Gln Asp Pro 20 25 30 Val Ser Ser Ile Cys Gln Ala Ala AspArg Gln Leu His Gln Leu Val 35 40 45 Gln Trp Ala Lys His Ile Pro His PheGlu Glu Leu Pro Leu Glu Asp 50 55 60 Arg Met Val Leu Leu Lys Ala Gly TrpAsn Glu Leu Leu Ile Ala Ala 65 70 75 80 Phe Ser His Arg Ser Val Asp ValArg Asp Gly Ile Val Leu Ala Thr 85 90 95 Gly Leu Val Val Gln Arg His SerAla His Gly Ala Gly Val Gly Ala 100 105 110 Ile Phe Asp Arg Val Leu ThrGlu Leu Val Ala Lys Met Arg Glu Met 115 120 125 Lys Met Asp Arg Thr GluLeu Gly Cys Leu Leu Ala Val Val Leu Phe 130 135 140 Asn Pro Glu Ala LysGly Leu Arg Thr Cys Pro Ser Gly Gly Pro Glu 145 150 155 160 Gly Glu SerVal 
Ser Ala Leu Glu Glu His Cys Arg Gln Gln Tyr Pro 165 170 175 Asp GlnPro Gly Arg Phe Ala Lys Leu Leu Leu Arg Leu Pro Ala Leu 180 185 190 ArgSer Ile Gly Leu Lys Cys Leu Glu His Leu Phe Phe Phe Lys Leu 195 200 205Ile Gly Asp Thr Pro Ile Asp Asn Phe Leu Leu Ser Met Leu Glu Ala 210 215220 Pro Ser Asp Pro 225 29 230 PRT Amblyomma americanum 29 Ser Pro AspMet Pro Leu Glu Arg Ile Leu Glu Ala Glu Met Arg Val 1 5 10 15 Glu GlnPro Ala Pro Ser Val Leu Ala Gln Thr Ala Ala Ser Gly Arg 20 25 30 Asp ProVal Asn Ser Met Cys Gln Ala Ala Pro Pro Leu His Glu Leu 35 40 45 Val GlnTrp Ala Arg Arg Ile Pro His Phe Glu Glu Leu Pro Ile Glu 50 55 60 Asp ArgThr Ala Leu Leu Lys Ala Gly Trp Asn Glu Leu Leu Ile Ala 65 70 75 80 AlaPhe Ser His Arg Ser Val Ala Val Arg Asp Gly Ile Val Leu Ala 85 90 95 ThrGly Leu Val Val Gln Arg His Ser Ala His Gly Ala Gly Val Gly 100 105 110Asp Ile Phe Asp Arg Val Leu Ala Glu Leu Val Ala Lys Met Arg Asp 115 120125 Met Lys Met Asp Lys Thr Glu Leu Gly Cys Leu Arg Ala Val Val Leu 130135 140 Phe Asn Pro Asp Ala Lys Gly Leu Arg Asn Ala Thr Arg Val Glu Ala145 150 155 160 Leu Arg Glu Lys Val Tyr Ala Ala Leu Glu Glu His Cys ArgArg His 165 170 175 His Pro Asp Gln Pro Gly Arg Phe Gly Lys Leu Leu LeuArg Leu Pro 180 185 190 Ala Leu Arg Ser Ile Gly Leu Lys Cys Leu Glu HisLeu Phe Phe Phe 195 200 205 Lys Leu Ile Gly Asp Thr Pro Ile Asp Ser PheLeu Leu Asn Met Leu 210 215 220 Glu Ala Pro Ala Asp Pro 225 230 30 266PRT Celuca pugilator 30 Ser Asp Met Pro Ile Ala Ser Ile Arg Glu Ala GluLeu Ser Val Asp 1 5 10 15 Pro Ile Asp Glu Gln Pro Leu Asp Gln Gly ValArg Leu Gln Val Pro 20 25 30 Leu Ala Pro Pro Asp Ser Glu Lys Cys Ser PheThr Leu Pro Phe His 35 40 45 Pro Val Ser Glu Val Ser Cys Ala Asn Pro LeuGln Asp Val Val Ser 50 55 60 Asn Ile Cys Gln Ala Ala Asp Arg His Leu ValGln Leu Val Glu Trp 65 70 75 80 Ala Lys His Ile Pro His Phe Thr Asp LeuPro Ile Glu Asp Gln Val 85 90 95 Val Leu Leu Lys Ala Gly Trp Asn Glu LeuLeu Ile Ala Ser Phe Ser 100 105 110 His Arg Ser Met Gly Val Glu Asp GlyIle Val Leu 
Ala Thr Gly Leu 115 120 125 Val Ile His Arg Ser Ser Ala HisGln Ala Gly Val Gly Ala Ile Phe 130 135 140 Asp Arg Val Leu Ser Glu LeuVal Ala Lys Met Lys Glu Met Lys Ile 145 150 155 160 Asp Lys Thr Glu LeuGly Cys Leu Arg Ser Ile Val Leu Phe Asn Pro 165 170 175 Asp Ala Lys GlyLeu Asn Cys Val Asn Asp Val Glu Ile Leu Arg Glu 180 185 190 Lys Val TyrAla Ala Leu Glu Glu Tyr Thr Arg Thr Thr Tyr Pro Asp 195 200 205 Glu ProGly Arg Phe Ala Lys Leu Leu Leu Arg Leu Pro Ala Leu Arg 210 215 220 SerIle Gly Leu Lys Cys Leu Glu Tyr Leu Phe Leu Phe Lys Leu Ile 225 230 235240 Gly Asp Thr Pro Leu Asp Ser Tyr Leu Met Lys Met Leu Val Asp Asn 245250 255 Pro Asn Thr Ser Val Thr Pro Pro Thr Ser 260 265 31 229 PRTTenebrio molitor 31 Ala Glu Met Pro Leu Asp Arg Ile Ile Glu Ala Glu LysArg Ile Glu 1 5 10 15 Cys Thr Pro Ala Gly Gly Ser Gly Gly Val Gly GluGln His Asp Gly 20 25 30 Val Asn Asn Ile Cys Gln Ala Thr Asn Lys Gln LeuPhe Gln Leu Val 35 40 45 Gln Trp Ala Lys Leu Ile Pro His Phe Thr Ser LeuPro Met Ser Asp 50 55 60 Gln Val Leu Leu Leu Arg Ala Gly Trp Asn Glu LeuLeu Ile Ala Ala 65 70 75 80 Phe Ser His Arg Ser Ile Gln Ala Gln Asp AlaIle Val Leu Ala Thr 85 90 95 Gly Leu Thr Val Asn Lys Thr Ser Ala His AlaVal Gly Val Gly Asn 100 105 110 Ile Tyr Asp Arg Val Leu Ser Glu Leu ValAsn Lys Met Lys Glu Met 115 120 125 Lys Met Asp Lys Thr Glu Leu Gly CysLeu Arg Ala Ile Ile Leu Tyr 130 135 140 Asn Pro Thr Cys Arg Gly Ile LysSer Val Gln Glu Val Glu Met Leu 145 150 155 160 Arg Glu Lys Ile Tyr GlyVal Leu Glu Glu Tyr Thr Arg Thr Thr His 165 170 175 Pro Asn Glu Pro GlyArg Phe Ala Lys Leu Leu Leu Arg Leu Pro Ala 180 185 190 Leu Arg Ser IleGly Leu Lys Cys Ser Glu His Leu Phe Phe Phe Lys 195 200 205 Leu Ile GlyAsp Val Pro Ile Asp Thr Phe Leu Met Glu Met Leu Glu 210 215 220 Ser ProAla Asp Ala 225 32 226 PRT Apis mellifera 32 His Ser Asp Met Pro Ile GluArg Ile Leu Glu Ala Glu Lys Arg Val 1 5 10 15 Glu Cys Lys Met Glu GlnGln Gly Asn Tyr Glu Asn Ala Val Ser His 20 25 30 Ile Cys Asn Ala Thr AsnLys Gln Leu Phe Gln Leu 
Val Ala Trp Ala 35 40 45 Lys His Ile Pro His PheThr Ser Leu Pro Leu Glu Asp Gln Val Leu 50 55 60 Leu Leu Arg Ala Gly TrpAsn Glu Leu Leu Ile Ala Ser Phe Ser His 65 70 75 80 Arg Ser Ile Asp ValLys Asp Gly Ile Val Leu Ala Thr Gly Ile Thr 85 90 95 Val His Arg Asn SerAla Gln Gln Ala Gly Val Gly Thr Ile Phe Asp 100 105 110 Arg Val Leu SerGlu Leu Val Ser Lys Met Arg Glu Met Lys Met Asp 115 120 125 Arg Thr GluLeu Gly Cys Leu Arg Ser Ile Ile Leu Phe Asn Pro Glu 130 135 140 Val ArgGly Leu Lys Ser Ile Gln Glu Val Thr Leu Leu Arg Glu Lys 145 150 155 160Ile Tyr Gly Ala Leu Glu Gly Tyr Cys Arg Val Ala Trp Pro Asp Asp 165 170175 Ala Gly Arg Phe Ala Lys Leu Leu Leu Arg Leu Pro Ala Ile Arg Ser 180185 190 Ile Gly Leu Lys Cys Leu Glu Tyr Leu Phe Phe Phe Lys Met Ile Gly195 200 205 Asp Val Pro Ile Asp Asp Phe Leu Val Glu Met Leu Glu Ser ArgSer 210 215 220 Asp Pro 225 33 516 DNA Locusta migratoria 33 atccctacctctggaggacc aggttctcct cctcagagca ggttggaatg aactgctaat 60 tgcagcattttcacatcgat ctgtagatgt taaagatggc atagtacttg ccactggtct 120 cacagtgcatcgaaattctg cccatcaagc tggagtcggc acaatatttg acagagtttt 180 gacagaactggtagcaaaga tgagagaaat gaaaatggat aaaactgaac ttggctgctt 240 gcgatctgttattcttttca atccagaggt gaggggtttg aaatccgccc aggaagttga 300 acttctacgtgaaaaagtat atgccgcttt ggaagaatat actagaacaa cacatcccga 360 tgaaccaggaagatttgcaa aacttttgct tcgtctgcct tctttacgtt ccataggcct 420 taagtgtttggagcatttgt tttctttcgc cttattggag atgttccaat tgatacgttc 480 ctgatggagatgcttgaatc accttctgat tcataa 516 34 528 DNA Amblyomma americanum 34attccacatt ttgaagagct tccccttgag gaccgcatgg tgttgctcaa ggctggctgg 60aacgagctgc tcattgctgc tttctcccac cgttctgttg acgtgcgtga tggcattgtg 120ctcgctacag gtcttgtggt gcagcggcat agtgctcatg gggctggcgt tggggccata 180tttgataggg ttctcactga actggtagca aagatgcgtg agatgaagat ggaccgcact 240gagcttggat gcctgcttgc tgtggtactt tttaatcctg aggccaaggg gctgcggacc 300tgcccaagtg gaggccctga gggagaaagt gtatctgcct tggaagagca ctgccggcag 360cagtacccag accagcctgg gcgctttgcc aagctgctgc tgcggttgcc agctctgcgc 
420agtattggcc tcaagtgcct cgaacatctc tttttcttca agctcatcgg ggacacgccc 480atcgacaact ttcttctttc catgctggag gccccctctg acccctaa 528 35 531 DNAAmblyomma americanum 35 attccgcact tcgaagagct tcccatcgag gatcgcaccgcgctgctcaa agccggctgg 60 aacgaactgc ttattgccgc cttttcgcac cgttctgtggcggtgcgcga cggcatcgtt 120 ctggccaccg ggctggtggt gcagcggcac agcgcacacggcgcaggcgt tggcgacatc 180 ttcgaccgcg tactagccga gctggtggcc aagatgcgcgacatgaagat ggacaaaacg 240 gagctcggct gcctgcgcgc cgtggtgctc ttcaatccagacgccaaggg tctccgaaac 300 gccaccagag tagaggcgct ccgcgagaag gtgtatgcggcgctggagga gcactgccgt 360 cggcaccacc cggaccaacc gggtcgcttc ggcaagctgctgctgcggct gcctgccttg 420 cgcagcatcg ggctcaaatg cctcgagcat ctgttcttcttcaagctcat cggagacact 480 cccatagaca gcttcctgct caacatgctg gaggcaccggcagaccccta g 531 36 552 DNA Celuca pugilator 36 atcccacact tcacagaccttcccatagag gaccaagtgg tattactcaa agccgggtgg 60 aacgagttgc ttattgcctcattctcacac cgtagcatgg gcgtggagga tggcatcgtg 120 ctggccacag ggctcgtgatccacagaagt agtgctcacc aggctggagt gggtgccata 180 tttgatcgtg tcctctctgagctggtggcc aagatgaagg agatgaagat tgacaagaca 240 gagctgggct gccttcgctccatcgtcctg ttcaacccag atgccaaagg actaaactgc 300 gtcaatgatg tggagatcttgcgtgagaag gtgtatgctg ccctggagga gtacacacga 360 accacttacc ctgatgaacctggacgcttt gccaagttgc ttctgcgact tcctgcactc 420 aggtctatag gcctgaagtgtcttgagtac ctcttcctgt ttaagctgat tggagacact 480 cccctggaca gctacttgatgaagatgctc gtagacaacc caaatacaag cgtcactccc 540 cccaccagct ag 552 37 531DNA Tenebrio molitor 37 atacctcact ttacctcgtt gccgatgtcg gaccaggtgcttttattgag ggcaggatgg 60 aatgaattgc tcatcgccgc attctcgcac agatctatacaggcgcagga tgccatcgtt 120 ctagccacgg ggttgacagt taacaaaacg tcggcgcacgccgtgggcgt gggcaacatc 180 tacgaccgcg tcctctccga gctggtgaac aagatgaaagagatgaagat ggacaagacg 240 gagctgggct gcttgagagc catcatcctc tacaaccccacgtgtcgcgg catcaagtcc 300 gtgcaggaag tggagatgct gcgtgagaaa atttacggcgtgctggaaga gtacaccagg 360 accacccacc cgaacgagcc cggcaggttc gccaaactgcttctgcgcct cccggccctc 420 aggtccatcg ggttgaaatg ttccgaacac 
ctctttttcttcaagctgat cggtgatgtt 480 ccaatagaca cgttcctgat ggagatgctg gagtctccggcggacgctta g 531 38 531 DNA Apis mellifera 38 atcccgcatt ttacctcgttgccactggag gatcaggtac ttctgctcag ggccggttgg 60 aacgagttgc tgatagcctccttttcccac cgttccatcg acgtgaagga cggtatcgtg 120 ctggcgacgg ggatcaccgtgcatcggaac tcggcgcagc aggccggcgt gggcacgata 180 ttcgaccgtg tcctctcggagcttgtctcg aaaatgcgtg aaatgaagat ggacaggaca 240 gagcttggct gtctcagatctataatactc ttcaatcccg aggttcgagg actgaaatcc 300 atccaggaag tgaccctgctccgtgagaag atctacggcg ccctggaggg ttattgccgc 360 gtagcttggc ccgacgacgctggaagattc gcgaaattac ttctacgcct gcccgccatc 420 cgctcgatcg gattaaagtgcctcgagtac ctgttcttct tcaaaatgat cggtgacgta 480 ccgatcgacg attttctcgtggagatgtta gaatcgcgat cagatcctta g 531 39 176 PRT Locusta migratoria 39Ile Pro His Phe Thr Ser Leu Pro Leu Glu Asp Gln Val Leu Leu Leu 1 5 1015 Arg Ala Gly Trp Asn Glu Leu Leu Ile Ala Ala Phe Ser His Arg Ser 20 2530 Val Asp Val Lys Asp Gly Ile Val Leu Ala Thr Gly Leu Thr Val His 35 4045 Arg Asn Ser Ala His Gln Ala Gly Val Gly Thr Ile Phe Asp Arg Val 50 5560 Leu Thr Glu Leu Val Ala Lys Met Arg Glu Met Lys Met Asp Lys Thr 65 7075 80 Glu Leu Gly Cys Leu Arg Ser Val Ile Leu Phe Asn Pro Glu Val Arg 8590 95 Gly Leu Lys Ser Ala Gln Glu Val Glu Leu Leu Arg Glu Lys Val Tyr100 105 110 Ala Ala Leu Glu Glu Tyr Thr Arg Thr Thr His Pro Asp Glu ProGly 115 120 125 Arg Phe Ala Lys Leu Leu Leu Arg Leu Pro Ser Leu Arg SerIle Gly 130 135 140 Leu Lys Cys Leu Glu His Leu Phe Phe Phe Arg Leu IleGly Asp Val 145 150 155 160 Pro Ile Asp Thr Phe Leu Met Glu Met Leu GluSer Pro Ser Asp Ser 165 170 175 40 175 PRT Amblyomma americanum 40 IlePro His Phe Glu Glu Leu Pro Leu Glu Asp Arg Met Val Leu Leu 1 5 10 15Lys Ala Gly Trp Asn Glu Leu Leu Ile Ala Ala Phe Ser His Arg Ser 20 25 30Val Asp Val Arg Asp Gly Ile Val Leu Ala Thr Gly Leu Val Val Gln 35 40 45Arg His Ser Ala His Gly Ala Gly Val Gly Ala Ile Phe Asp Arg Val 50 55 60Leu Thr Glu Leu Val Ala Lys Met Arg Glu Met Lys Met Asp Arg Thr 65 70 7580 Glu Leu Gly Cys 
Leu Leu Ala Val Val Leu Phe Asn Pro Glu Ala Lys 85 9095 Gly Leu Arg Thr Cys Pro Ser Gly Gly Pro Glu Gly Glu Ser Val Ser 100105 110 Ala Leu Glu Glu His Cys Arg Gln Gln Tyr Pro Asp Gln Pro Gly Arg115 120 125 Phe Ala Lys Leu Leu Leu Arg Leu Pro Ala Leu Arg Ser Ile GlyLeu 130 135 140 Lys Cys Leu Glu His Leu Phe Phe Phe Lys Leu Ile Gly AspThr Pro 145 150 155 160 Ile Asp Asn Phe Leu Leu Ser Met Leu Glu Ala ProSer Asp Pro 165 170 175 41 176 PRT Amblyomma americanum 41 Ile Pro HisPhe Glu Glu Leu Pro Ile Glu Asp Arg Thr Ala Leu Leu 1 5 10 15 Lys AlaGly Trp Asn Glu Leu Leu Ile Ala Ala Phe Ser His Arg Ser 20 25 30 Val AlaVal Arg Asp Gly Ile Val Leu Ala Thr Gly Leu Val Val Gln 35 40 45 Arg HisSer Ala His Gly Ala Gly Val Gly Asp Ile Phe Asp Arg Val 50 55 60 Leu AlaGlu Leu Val Ala Lys Met Arg Asp Met Lys Met Asp Lys Thr 65 70 75 80 GluLeu Gly Cys Leu Arg Ala Val Val Leu Phe Asn Pro Asp Ala Lys 85 90 95 GlyLeu Arg Asn Ala Thr Arg Val Glu Ala Leu Arg Glu Lys Val Tyr 100 105 110Ala Ala Leu Glu Glu His Cys Arg Arg His His Pro Asp Gln Pro Gly 115 120125 Arg Phe Gly Lys Leu Leu Leu Arg Leu Pro Ala Leu Arg Ser Ile Gly 130135 140 Leu Lys Cys Leu Glu His Leu Phe Phe Phe Lys Leu Ile Gly Asp Thr145 150 155 160 Pro Ile Asp Ser Phe Leu Leu Asn Met Leu Glu Ala Pro AlaAsp Pro 165 170 175 42 183 PRT Celuca pugilator 42 Ile Pro His Phe ThrAsp Leu Pro Ile Glu Asp Gln Val Val Leu Leu 1 5 10 15 Lys Ala Gly TrpAsn Glu Leu Leu Ile Ala Ser Phe Ser His Arg Ser 20 25 30 Met Gly Val GluAsp Gly Ile Val Leu Ala Thr Gly Leu Val Ile His 35 40 45 Arg Ser Ser AlaHis Gln Ala Gly Val Gly Ala Ile Phe Asp Arg Val 50 55 60 Leu Ser Glu LeuVal Ala Lys Met Lys Glu Met Lys Ile Asp Lys Thr 65 70 75 80 Glu Leu GlyCys Leu Arg Ser Ile Val Leu Phe Asn Pro Asp Ala Lys 85 90 95 Gly Leu AsnCys Val Asn Asp Val Glu Ile Leu Arg Glu Lys Val Tyr 100 105 110 Ala AlaLeu Glu Glu Tyr Thr Arg Thr Thr Tyr Pro Asp Glu Pro Gly 115 120 125 ArgPhe Ala Lys Leu Leu Leu Arg Leu Pro Ala Leu Arg Ser Ile Gly 130 135 140Leu Lys Cys Leu Glu Tyr Leu 
Phe Leu Phe Lys Leu Ile Gly Asp Thr 145 150155 160 Pro Leu Asp Ser Tyr Leu Met Lys Met Leu Val Asp Asn Pro Asn Thr165 170 175 Ser Val Thr Pro Pro Thr Ser 180 43 176 PRT Tenebrio molitor43 Ile Pro His Phe Thr Ser Leu Pro Met Ser Asp Gln Val Leu Leu Leu 1 510 15 Arg Ala Gly Trp Asn Glu Leu Leu Ile Ala Ala Phe Ser His Arg Ser 2025 30 Ile Gln Ala Gln Asp Ala Ile Val Leu Ala Thr Gly Leu Thr Val Asn 3540 45 Lys Thr Ser Ala His Ala Val Gly Val Gly Asn Ile Tyr Asp Arg Val 5055 60 Leu Ser Glu Leu Val Asn Lys Met Lys Glu Met Lys Met Asp Lys Thr 6570 75 80 Glu Leu Gly Cys Leu Arg Ala Ile Ile Leu Tyr Asn Pro Thr Cys Arg85 90 95 Gly Ile Lys Ser Val Gln Glu Val Glu Met Leu Arg Glu Lys Ile Tyr100 105 110 Gly Val Leu Glu Glu Tyr Thr Arg Thr Thr His Pro Asn Glu ProGly 115 120 125 Arg Phe Ala Lys Leu Leu Leu Arg Leu Pro Ala Leu Arg SerIle Gly 130 135 140 Leu Lys Cys Ser Glu His Leu Phe Phe Phe Lys Leu IleGly Asp Val 145 150 155 160 Pro Ile Asp Thr Phe Leu Met Glu Met Leu GluSer Pro Ala Asp Ala 165 170 175 44 176 PRT Apis mellifera 44 Ile Pro HisPhe Thr Ser Leu Pro Leu Glu Asp Gln Val Leu Leu Leu 1 5 10 15 Arg AlaGly Trp Asn Glu Leu Leu Ile Ala Ser Phe Ser His Arg Ser 20 25 30 Ile AspVal Lys Asp Gly Ile Val Leu Ala Thr Gly Ile Thr Val His 35 40 45 Arg AsnSer Ala Gln Gln Ala Gly Val Gly Thr Ile Phe Asp Arg Val 50 55 60 Leu SerGlu Leu Val Ser Lys Met Arg Glu Met Lys Met Asp Arg Thr 65 70 75 80 GluLeu Gly Cys Leu Arg Ser Ile Ile Leu Phe Asn Pro Glu Val Arg 85 90 95 GlyLeu Lys Ser Ile Gln Glu Val Thr Leu Leu Arg Glu Lys Ile Tyr 100 105 110Gly Ala Leu Glu Gly Tyr Cys Arg Val Ala Trp Pro Asp Asp Ala Gly 115 120125 Arg Phe Ala Lys Leu Leu Leu Arg Leu Pro Ala Ile Arg Ser Ile Gly 130135 140 Leu Lys Cys Leu Glu Tyr Leu Phe Phe Phe Lys Met Ile Gly Asp Val145 150 155 160 Pro Ile Asp Asp Phe Leu Val Glu Met Leu Glu Ser Arg SerAsp Pro 165 170 175 45 711 DNA Artificial Sequence Chimeric RXR ligandbinding domain 45 gccaacgagg acatgcctgt agagaagatt ctggaagccg agcttgctgtcgagcccaag 60 actgagacat 
acgtggaggc aaacatgggg ctgaacccca gctcaccaaatgaccctgtt 120 accaacatct gtcaagcagc agacaagcag ctcttcactc ttgtggagtgggccaagagg 180 atcccacact tttctgagct gcccctagac gaccaggtca tcctgctacgggcaggctgg 240 aacgagctgc tgatcgcctc cttctcccac cgctccatag ctgtgaaagatgggattctc 300 ctggccaccg gcctgcacgt acaccggaac agcgctcaca gtgctggggtgggcgccatc 360 tttgacaggg tgctaacaga gctggtgtct aagatgcgtg acatgcagatggacaagact 420 gaacttggct gcttgcgatc tgttattctt ttcaatccag aggtgaggggtttgaaatcc 480 gcccaggaag ttgaacttct acgtgaaaaa gtatatgccg ctttggaagaatatactaga 540 acaacacatc ccgatgaacc aggaagattt gcaaaacttt tgcttcgtctgccttcttta 600 cgttccatag gccttaagtg tttggagcat ttgtttttct ttcgccttattggagatgtt 660 ccaattgata cgttcctgat ggagatgctt gaatcacctt ctgattcata a711 46 236 PRT Artificial Sequence Chimeric RXR ligand binding domain 46Ala Asn Glu Asp Met Pro Val Glu Lys Ile Leu Glu Ala Glu Leu Ala 1 5 1015 Val Glu Pro Lys Thr Glu Thr Tyr Val Glu Ala Asn Met Gly Leu Asn 20 2530 Pro Ser Ser Pro Asn Asp Pro Val Thr Asn Ile Cys Gln Ala Ala Asp 35 4045 Lys Gln Leu Phe Thr Leu Val Glu Trp Ala Lys Arg Ile Pro His Phe 50 5560 Ser Glu Leu Pro Leu Asp Asp Gln Val Ile Leu Leu Arg Ala Gly Trp 65 7075 80 Asn Glu Leu Leu Ile Ala Ser Phe Ser His Arg Ser Ile Ala Val Lys 8590 95 Asp Gly Ile Leu Leu Ala Thr Gly Leu His Val His Arg Asn Ser Ala100 105 110 His Ser Ala Gly Val Gly Ala Ile Phe Asp Arg Val Leu Thr GluLeu 115 120 125 Val Ser Lys Met Arg Asp Met Gln Met Asp Lys Thr Glu LeuGly Cys 130 135 140 Leu Arg Ser Val Ile Leu Phe Asn Pro Glu Val Arg GlyLeu Lys Ser 145 150 155 160 Ala Gln Glu Val Glu Leu Leu Arg Glu Lys ValTyr Ala Ala Leu Glu 165 170 175 Glu Tyr Thr Arg Thr Thr His Pro Asp GluPro Gly Arg Phe Ala Lys 180 185 190 Leu Leu Leu Arg Leu Pro Ser Leu ArgSer Ile Gly Leu Lys Cys Leu 195 200 205 Glu His Leu Phe Phe Phe Arg LeuIle Gly Asp Val Pro Ile Asp Thr 210 215 220 Phe Leu Met Glu Met Leu GluSer Pro Ser Asp Ser 225 230 235 47 441 DNA Saccharomyces cerevisiae 47atgaagctac tgtcttctat cgaacaagca tgcgatattt gccgacttaa 
aaagctcaag 60tgctccaaag aaaaaccgaa gtgcgccaag tgtctgaaga acaactggga gtgtcgctac 120tctcccaaaa ccaaaaggtc tccgctgact agggcacatc tgacagaagt ggaatcaagg 180ctagaaagac tggaacagct atttctactg atttttcctc gagaagacct tgacatgatt 240ttgaaaatgg attctttaca ggatataaaa gcattgttaa caggattatt tgtacaagat 300aatgtgaata aagatgccgt cacagataga ttggcttcag tggagactga tatgcctcta 360acattgagac agcatagaat aagtgcgaca tcatcatcgg aagagagtag taacaaaggt 420caaagacagt tgactgtatc g 441 48 147 PRT Saccharomyces cerevisiae 48 MetLys Leu Leu Ser Ser Ile Glu Gln Ala Cys Asp Ile Cys Arg Leu 1 5 10 15Lys Lys Leu Lys Cys Ser Lys Glu Lys Pro Lys Cys Ala Lys Cys Leu 20 25 30Lys Asn Asn Trp Glu Cys Arg Tyr Ser Pro Lys Thr Lys Arg Ser Pro 35 40 45Leu Thr Arg Ala His Leu Thr Glu Val Glu Ser Arg Leu Glu Arg Leu 50 55 60Glu Gln Leu Phe Leu Leu Ile Phe Pro Arg Glu Asp Leu Asp Met Ile 65 70 7580 Leu Lys Met Asp Ser Leu Gln Asp Ile Lys Ala Leu Leu Thr Gly Leu 85 9095 Phe Val Gln Asp Asn Val Asn Lys Asp Ala Val Thr Asp Arg Leu Ala 100105 110 Ser Val Glu Thr Asp Met Pro Leu Thr Leu Arg Gln His Arg Ile Ser115 120 125 Ala Thr Ser Ser Ser Glu Glu Ser Ser Asn Lys Gly Gln Arg GlnLeu 130 135 140 Thr Val Ser 145 49 606 DNA Escherichia coli 49atgaaagcgt taacggccag gcaacaagag gtgtttgatc tcatccgtga tcacatcagc 60cagacaggta tgccgccgac gcgtgcggaa atcgcgcagc gtttggggtt ccgttcccca 120aacgcggctg aagaacatct gaaggcgctg gcacgcaaag gcgttattga aattgtttcc 180ggcgcatcac gcgggattcg tctgttgcag gaagaggaag aagggttgcc gctggtaggt 240cgtgtggctg ccggtgaacc acttctggcg caacagcata ttgaaggtca ttatcaggtc 300gatccttcct tattcaagcc gaatgctgat ttcctgctgc gcgtcagcgg gatgtcgatg 360aaagatatcg gcattatgga tggtgacttg ctggcagtgc ataaaactca ggatgtacgt 420aacggtcagg tcgttgtcgc acgtattgat gacgaagtta ccgttaagcg cctgaaaaaa 480cagggcaata aagtcgaact gttgccagaa aatagcgagt ttaaaccaat tgtcgtagat 540cttcgtcagc agagcttcac cattgaaggg ctggcggttg gggttattcg caacggcgac 600tggctg 606 50 202 PRT Escherichia coli 50 Met Lys Ala Leu Thr Ala ArgGln Gln Glu Val Phe Asp Leu Ile Arg 1 5 
10 15 Asp His Ile Ser Gln ThrGly Met Pro Pro Thr Arg Ala Glu Ile Ala 20 25 30 Gln Arg Leu Gly Phe ArgSer Pro Asn Ala Ala Glu Glu His Leu Lys 35 40 45 Ala Leu Ala Arg Lys GlyVal Ile Glu Ile Val Ser Gly Ala Ser Arg 50 55 60 Gly Ile Arg Leu Leu GlnGlu Glu Glu Glu Gly Leu Pro Leu Val Gly 65 70 75 80 Arg Val Ala Ala GlyGlu Pro Leu Leu Ala Gln Gln His Ile Glu Gly 85 90 95 His Tyr Gln Val AspPro Ser Leu Phe Lys Pro Asn Ala Asp Phe Leu 100 105 110 Leu Arg Val SerGly Met Ser Met Lys Asp Ile Gly Ile Met Asp Gly 115 120 125 Asp Leu LeuAla Val His Lys Thr Gln Asp Val Arg Asn Gly Gln Val 130 135 140 Val ValAla Arg Ile Asp Asp Glu Val Thr Val Lys Arg Leu Lys Lys 145 150 155 160Gln Gly Asn Lys Val Glu Leu Leu Pro Glu Asn Ser Glu Phe Lys Pro 165 170175 Ile Val Val Asp Leu Arg Gln Gln Ser Phe Thr Ile Glu Gly Leu Ala 180185 190 Val Gly Val Ile Arg Asn Gly Asp Trp Leu 195 200 51 271 DNAherpes simplex virus 7 51 atgggcccta aaaagaagcg taaagtcgcc cccccgaccgatgtcagcct gggggacgag 60 ctccacttag acggcgagga cgtggcgatg gcgcatgccgacgcgctaga cgatttcgat 120 ctggacatgt tgggggacgg ggattccccg gggccgggatttacccccca cgactccgcc 180 ccctacggcg ctctggatat ggccgacttc gagtttgagcagatgtttac cgatgccctt 240 ggaattgacg agtacggtgg ggaattcccg g 271 52 90PRT herpes simplex virus 7 52 Met Gly Pro Lys Lys Lys Arg Lys Val AlaPro Pro Thr Asp Val Ser 1 5 10 15 Leu Gly Asp Glu Leu His Leu Asp GlyGlu Asp Val Ala Met Ala His 20 25 30 Ala Asp Ala Leu Asp Asp Phe Asp LeuAsp Met Leu Gly Asp Gly Asp 35 40 45 Ser Pro Gly Pro Gly Phe Thr Pro HisAsp Ser Ala Pro Tyr Gly Ala 50 55 60 Leu Asp Met Ala Asp Phe Glu Phe GluGln Met Phe Thr Asp Ala Leu 65 70 75 80 Gly Ile Asp Glu Tyr Gly Gly GluPhe Pro 85 90 53 307 DNA Saccharomyces cerevisiae 53 atgggtgctcctccaaaaaa gaagagaaag gtagctggta tcaataaaga tatcgaggag 60 tgcaatgccatcattgagca gtttatcgac tacctgcgca ccggacagga gatgccgatg 120 gaaatggcggatcaggcgat taacgtggtg ccgggcatga cgccgaaaac cattcttcac 180 gccgggccgccgatccagcc tgactggctg aaatcgaatg gttttcatga aattgaagcg 240 gatgttaacgataccagcct 
cttgctgagt ggagatgcct cctaccctta tgatgtgcca 300 gattatg 30754 102 PRT Saccharomyces cerevisiae 54 Met Gly Ala Pro Pro Lys Lys LysArg Lys Val Ala Gly Ile Asn Lys 1 5 10 15 Asp Ile Glu Glu Cys Asn AlaIle Ile Glu Gln Phe Ile Asp Tyr Leu 20 25 30 Arg Thr Gly Gln Glu Met ProMet Glu Met Ala Asp Gln Ala Ile Asn 35 40 45 Val Val Pro Gly Met Thr ProLys Thr Ile Leu His Ala Gly Pro Pro 50 55 60 Ile Gln Pro Asp Trp Leu LysSer Asn Gly Phe His Glu Ile Glu Ala 65 70 75 80 Asp Val Asn Asp Thr SerLeu Leu Leu Ser Gly Asp Ala Ser Tyr Pro 85 90 95 Tyr Asp Val Pro Asp Tyr100 55 19 DNA Artificial Sequence GAL4 response element 55 ggagtactgtcctccgagc 19 56 36 DNA Artificial Sequence 2xLexAop response element 56ctgctgtata taaaaccagt ggttatatgt acagta 36 57 334 PRT Choristoneurafumiferana 57 Pro Glu Cys Val Val Pro Glu Thr Gln Cys Ala Met Lys ArgLys Glu 1 5 10 15 Lys Lys Ala Gln Lys Glu Lys Asp Lys Leu Pro Val SerThr Thr Thr 20 25 30 Val Asp Asp His Met Pro Pro Ile Met Gln Cys Glu ProPro Pro Pro 35 40 45 Glu Ala Ala Arg Ile His Glu Val Val Pro Arg Phe LeuSer Asp Lys 50 55 60 Leu Leu Glu Thr Asn Arg Gln Lys Asn Ile Pro Gln LeuThr Ala Asn 65 70 75 80 Gln Gln Phe Leu Ile Ala Arg Leu Ile Trp Tyr GlnAsp Gly Tyr Glu 85 90 95 Gln Pro Ser Asp Glu Asp Leu Lys Arg Ile Thr GlnThr Trp Gln Gln 100 105 110 Ala Asp Asp Glu Asn Glu Glu Ser Asp Thr ProPhe Arg Gln Ile Thr 115 120 125 Glu Met Thr Ile Leu Thr Val Gln Leu IleVal Glu Phe Ala Lys Gly 130 135 140 Leu Pro Gly Phe Ala Lys Ile Ser GlnPro Asp Gln Ile Thr Leu Leu 145 150 155 160 Lys Ala Cys Ser Ser Glu ValMet Met Leu Arg Val Ala Arg Arg Tyr 165 170 175 Asp Ala Ala Ser Asp SerVal Leu Phe Ala Asn Asn Gln Ala Tyr Thr 180 185 190 Arg Asp Asn Tyr ArgLys Ala Gly Met Ala Tyr Val Ile Glu Asp Leu 195 200 205 Leu His Phe CysArg Cys Met Tyr Ser Met Ala Leu Asp Asn Ile His 210 215 220 Tyr Ala LeuLeu Thr Ala Val Val Ile Phe Ser Asp Arg Pro Gly Leu 225 230 235 240 GluGln Pro Gln Leu Val Glu Glu Ile Gln Arg Tyr Tyr Leu Asn Thr 245 250 255Leu Arg Ile Tyr Ile Leu Asn 
Gln Leu Ser Gly Ser Ala Arg Ser Ser 260 265270 Val Ile Tyr Gly Lys Ile Leu Ser Ile Leu Ser Glu Leu Arg Thr Leu 275280 285 Gly Met Gln Asn Ser Asn Met Cys Ile Ser Leu Lys Leu Lys Asn Arg290 295 300 Lys Leu Pro Pro Phe Leu Glu Glu Ile Trp Asp Val Ala Asp MetSer 305 310 315 320 His Thr Gln Pro Pro Pro Ile Leu Glu Ser Pro Thr AsnLeu 325 330 58 549 PRT Drosophila melanogaster 58 Arg Pro Glu Cys ValVal Pro Glu Asn Gln Cys Ala Met Lys Arg Arg 1 5 10 15 Glu Lys Lys AlaGln Lys Glu Lys Asp Lys Met Thr Thr Ser Pro Ser 20 25 30 Ser Gln His GlyGly Asn Gly Ser Leu Ala Ser Gly Gly Gly Gln Asp 35 40 45 Phe Val Lys LysGlu Ile Leu Asp Leu Met Thr Cys Glu Pro Pro Gln 50 55 60 His Ala Thr IlePro Leu Leu Pro Asp Glu Ile Leu Ala Lys Cys Gln 65 70 75 80 Ala Arg AsnIle Pro Ser Leu Thr Tyr Asn Gln Leu Ala Val Ile Tyr 85 90 95 Lys Leu IleTrp Tyr Gln Asp Gly Tyr Glu Gln Pro Ser Glu Glu Asp 100 105 110 Leu ArgArg Ile Met Ser Gln Pro Asp Glu Asn Glu Ser Gln Thr Asp 115 120 125 ValSer Phe Arg His Ile Thr Glu Ile Thr Ile Leu Thr Val Gln Leu 130 135 140Ile Val Glu Phe Ala Lys Gly Leu Pro Ala Phe Thr Lys Ile Pro Gln 145 150155 160 Glu Asp Gln Ile Thr Leu Leu Lys Ala Cys Ser Ser Glu Val Met Met165 170 175 Leu Arg Met Ala Arg Arg Tyr Asp His Ser Ser Asp Ser Ile PhePhe 180 185 190 Ala Asn Asn Arg Ser Tyr Thr Arg Asp Ser Tyr Lys Met AlaGly Met 195 200 205 Ala Asp Asn Ile Glu Asp Leu Leu His Phe Cys Arg GlnMet Phe Ser 210 215 220 Met Lys Val Asp Asn Val Glu Tyr Ala Leu Leu ThrAla Ile Val Ile 225 230 235 240 Phe Ser Asp Arg Pro Gly Leu Glu Lys AlaGln Leu Val Glu Ala Ile 245 250 255 Gln Ser Tyr Tyr Ile Asp Thr Leu ArgIle Tyr Ile Leu Asn Arg His 260 265 270 Cys Gly Asp Ser Met Ser Leu ValPhe Tyr Ala Lys Leu Leu Ser Ile 275 280 285 Leu Thr Glu Leu Arg Thr LeuGly Asn Gln Asn Ala Glu Met Cys Phe 290 295 300 Ser Leu Lys Leu Lys AsnArg Lys Leu Pro Lys Phe Leu Glu Glu Ile 305 310 315 320 Trp Asp Val HisAla Ile Pro Pro Ser Val Gln Ser His Leu Gln Ile 325 330 335 Thr Gln GluGlu Asn Glu Arg Leu Glu Arg Ala Glu Arg 
Met Arg Ala 340 345 350 Ser ValGly Gly Ala Ile Thr Ala Gly Ile Asp Cys Asp Ser Ala Ser 355 360 365 ThrSer Ala Ala Ala Ala Ala Ala Gln His Gln Pro Gln Pro Gln Pro 370 375 380Gln Pro Gln Pro Ser Ser Leu Thr Gln Asn Asp Ser Gln His Gln Thr 385 390395 400 Gln Pro Gln Leu Gln Pro Gln Leu Pro Pro Gln Leu Gln Gly Gln Leu405 410 415 Gln Pro Gln Leu Gln Pro Gln Leu Gln Thr Gln Leu Gln Pro GlnIle 420 425 430 Gln Pro Gln Pro Gln Leu Leu Pro Val Ser Ala Pro Val ProAla Ser 435 440 445 Val Thr Ala Pro Gly Ser Leu Ser Ala Val Ser Thr SerSer Glu Tyr 450 455 460 Met Gly Gly Ser Ala Ala Ile Gly Pro Ile Thr ProAla Thr Thr Ser 465 470 475 480 Ser Ile Thr Ala Ala Val Thr Ala Ser SerThr Thr Ser Ala Val Pro 485 490 495 Met Gly Asn Gly Val Gly Val Gly ValGly Val Gly Gly Asn Val Ser 500 505 510 Met Tyr Ala Asn Ala Gln Thr AlaMet Ala Leu Met Gly Val Ala Leu 515 520 525 His Ser His Gln Glu Gln LeuIle Gly Gly Val Ala Val Lys Ser Glu 530 535 540 His Ser Thr Thr Ala 54559 1288 DNA Choristoneura fumiferana 59 aagggccctg cgccccgtca gcaagaggaactgtgtctgg tatgcgggga cagagcctcc 60 ggataccact acaatgcgct cacgtgtgaagggtgtaaag ggttcttcag acggagtgtt 120 accaaaaatg cggtttatat ttgtaaattcggtcacgctt gcgaaatgga catgtacatg 180 cgacggaaat gccaggagtg ccgcctgaagaagtgcttag ctgtaggcat gaggcctgag 240 tgcgtagtac ccgagactca gtgcgccatgaagcggaaag agaagaaagc acagaaggag 300 aaggacaaac tgcctgtcag cacgacgacggtggacgacc acatgccgcc cattatgcag 360 tgtgaacctc cacctcctga agcagcaaggattcacgaag tggtcccaag gtttctctcc 420 gacaagctgt tggagacaaa ccggcagaaaaacatccccc agttgacagc caaccagcag 480 ttccttatcg ccaggctcat ctggtaccaggacgggtacg agcagccttc tgatgaagat 540 ttgaagagga ttacgcagac gtggcagcaagcggacgatg aaaacgaaga gtctgacact 600 cccttccgcc agatcacaga gatgactatcctcacggtcc aacttatcgt ggagttcgcg 660 aagggattgc cagggttcgc caagatctcgcagcctgatc aaattacgct gcttaaggct 720 tgctcaagtg aggtaatgat gctccgagtcgcgcgacgat acgatgcggc ctcagacagt 780 gttctgttcg cgaacaacca agcgtacactcgcgacaact accgcaaggc tggcatggcc 840 tacgtcatcg aggatctact 
gcacttctgccggtgcatgt actctatggc gttggacaac 900 atccattacg cgctgctcac ggctgtcgtcatcttttctg accggccagg gttggagcag 960 ccgcaactgg tggaagaaat ccagcggtactacctgaata cgctccgcat ctatatcctg 1020 aaccagctga gcgggtcggc gcgttcgtccgtcatatacg gcaagatcct ctcaatcctc 1080 tctgagctac gcacgctcgg catgcaaaactccaacatgt gcatctccct caagctcaag 1140 aacagaaagc tgccgccttt cctcgaggagatctgggatg tggcggacat gtcgcacacc 1200 caaccgccgc ctatcctcga gtcccccacgaatctctagc ccctgcgcgc acgcatcgcc 1260 gatgccgcgt ccggccgcgc tgctctga1288 60 309 DNA Simian virus 40 60 ggtgtggaaa gtccccaggc tccccagcaggcagaagtat gcaaagcatg catctcaatt 60 agtcagcaac caggtgtgga aagtccccaggctccccagc aggcagaagt atgcaaagca 120 tgcatctcaa ttagtcagca accatagtcccgcccctaac tccgcccatc ccgcccctaa 180 ctccgcccag ttccgcccat tctccgccccatggctgact aatttttttt atttatgcag 240 aggccgaggc cgcctcggcc tctgagctattccagaagta gtgaggaggc ttttttggag 300 gcctaggct 309 61 24 DNA ArtificialSequence synthetic E1b minimal promoter 61 tatataatgg atccccgggt accg 2462 1653 DNA Artificial Sequence luciferase gene 62 atggaagacg ccaaaaacataaagaaaggc ccggcgccat tctatcctct agaggatgga 60 accgctggag agcaactgcataaggctatg aagagatacg ccctggttcc tggaacaatt 120 gcttttacag atgcacatatcgaggtgaac atcacgtacg cggaatactt cgaaatgtcc 180 gttcggttgg cagaagctatgaaacgatat gggctgaata caaatcacag aatcgtcgta 240 tgcagtgaaa actctcttcaattctttatg ccggtgttgg gcgcgttatt tatcggagtt 300 gcagttgcgc ccgcgaacgacatttataat gaacgtgaat tgctcaacag tatgaacatt 360 tcgcagccta ccgtagtgtttgtttccaaa aaggggttgc aaaaaatttt gaacgtgcaa 420 aaaaaattac caataatccagaaaattatt atcatggatt ctaaaacgga ttaccaggga 480 tttcagtcga tgtacacgttcgtcacatct catctacctc ccggttttaa tgaatacgat 540 tttgtaccag agtcctttgatcgtgacaaa acaattgcac tgataatgaa ttcctctgga 600 tctactgggt tacctaagggtgtggccctt ccgcatagaa ctgcctgcgt cagattctcg 660 catgccagag atcctatttttggcaatcaa atcattccgg atactgcgat tttaagtgtt 720 gttccattcc atcacggttttggaatgttt actacactcg gatatttgat atgtggattt 780 cgagtcgtct taatgtatagatttgaagaa gagctgtttt tacgatccct tcaggattac 840 
aaaattcaaa gtgcgttgctagtaccaacc ctattttcat tcttcgccaa aagcactctg 900 attgacaaat acgatttatctaatttacac gaaattgctt ctgggggcgc acctctttcg 960 aaagaagtcg gggaagcggttgcaaaacgc ttccatcttc cagggatacg acaaggatat 1020 gggctcactg agactacatcagctattctg attacacccg agggggatga taaaccgggc 1080 gcggtcggta aagttgttccattttttgaa gcgaaggttg tggatctgga taccgggaaa 1140 acgctgggcg ttaatcagagaggcgaatta tgtgtcagag gacctatgat tatgtccggt 1200 tatgtaaaca atccggaagcgaccaacgcc ttgattgaca aggatggatg gctacattct 1260 ggagacatag cttactgggacgaagacgaa cacttcttca tagttgaccg cttgaagtct 1320 ttaattaaat acaaaggatatcaggtggcc cccgctgaat tggaatcgat attgttacaa 1380 caccccaaca tcttcgacgcgggcgtggca ggtcttcccg acgatgacgc cggtgaactt 1440 cccgccgccg ttgttgttttggagcacgga aagacgatga cggaaaaaga gatcgtggat 1500 tacgtcgcca gtcaagtaacaaccgcgaaa aagttgcgcg gaggagttgt gtttgtggac 1560 gaagtaccga aaggtcttaccggaaaactc gacgcaagaa aaatcagaga gatcctcata 1620 aaggccaaga agggcggaaagtccaaattg taa 1653 63 18 DNA Mus musculus 63 ccacatcaag ccacctag 18 6418 DNA Locusta migratoria 64 tcaccttctg attcataa 18 65 1054 DNAChoristoneura fumiferana 65 cctgagtgcg tagtacccga gactcagtgc gccatgaagcggaaagagaa gaaagcacag 60 aaggagaagg acaaactgcc tgtcagcacg acgacggtggacgaccacat gccgcccatt 120 atgcagtgtg aacctccacc tcctgaagca gcaaggattcacgaagtggt cccaaggttt 180 ctctccgaca agctgttgga gacaaaccgg cagaaaaacatcccccagtt gacagccaac 240 cagcagttcc ttatcgccag gctcatctgg taccaggacgggtacgagca gccttctgat 300 gaagatttga agaggattac gcagacgtgg cagcaagcggacgatgaaaa cgaagagtct 360 gacactccct tccgccagat cacagagatg actatcctcacggtccaact tatcgtggag 420 ttcgcgaagg gattgccagg gttcgccaag atctcgcagcctgatcaaat tacgctgctt 480 aaggcttgct caagtgaggt aatgatgctc cgagtcgcgcgacgatacga tgcggcctca 540 gacagtgttc tgttcgcgaa caaccaagcg tacactcgcgacaactaccg caaggctggc 600 atggcctacg tcatcgagga tctactgcac ttctgccggtgcatgtactc tatggcgttg 660 gacaacatcc attacgcgct gctcacggct gtcgtcatcttttctgaccg gccagggttg 720 gagcagccgc aactggtgga agaaatccag cggtactacctgaatacgct ccgcatctat 780 atcctgaacc 
agctgagcgg gtcggcgcgt tcgtccgtcatatacggcaa gatcctctca 840 atcctctctg agctacgcac gctcggcatg caaaactccaacatgtgcat ctccctcaag 900 ctcaagaaca gaaagctgcc gcctttcctc gaggagatctgggatgtggc ggacatgtcg 960 cacacccaac cgccgcctat cctcgagtcc cccacgaatctctagcccct gcgcgcacgc 1020 atcgccgatg ccgcgtccgg ccgcgctgct ctga 1054 66798 DNA Choristoneura fumiferana 66 tcggtgcagg taagcgatga gctgtcaatcgagcgcctaa cggagatgga gtctttggtg 60 gcagatccca gcgaggagtt ccagttcctccgcgtggggc ctgacagcaa cgtgcctcca 120 cgttaccgcg cgcccgtctc ctccctctgccaaataggca acaagcaaat agcggcgttg 180 gtggtatggg cgcgcgacat ccctcatttcgggcagctgg agctggacga tcaagtggta 240 ctcatcaagg cctcctggaa tgagctgctactcttcgcca tcgcctggcg ctctatggag 300 tatttggaag atgagaggga gaacggggacggaacgcgga gcaccactca gccacaactg 360 atgtgtctca tgcctggcat gacgttgcaccgcaactcgg cgcagcaggc gggcgtgggc 420 gccatcttcg accgcgtgct gtccgagctcagtctgaaga tgcgcacctt gcgcatggac 480 caggccgagt acgtcgcgct caaagccatcgtgctgctca accctgatgt gaaaggactg 540 aagaatcggc aagaagttga cgttttgcgagaaaaaatgt tctcttgcct ggacgactac 600 tgccggcggt cgcgaagcaa cgaggaaggccggtttgcgt ccttgctgct gcggctgcca 660 gctctccgct ccatctcgct caagagcttcgaacacctct acttcttcca cctcgtggcc 720 gaaggctcca tcagcggata catacgagaggcgctccgaa accacgcgcc tccgatcgac 780 gtcaatgcca tgatgtaa 798 67 1650 DNADrosophila melanogaster 67 cggccggaat gcgtcgtccc ggagaaccaa tgtgcgatgaagcggcgcga aaagaaggcc 60 cagaaggaga aggacaaaat gaccacttcg ccgagctctcagcatggcgg caatggcagc 120 ttggcctctg gtggcggcca agactttgtt aagaaggagattcttgacct tatgacatgc 180 gagccgcccc agcatgccac tattccgcta ctacctgatgaaatattggc caagtgtcaa 240 gcgcgcaata taccttcctt aacgtacaat cagttggccgttatatacaa gttaatttgg 300 taccaggatg gctatgagca gccatctgaa gaggatctcaggcgtataat gagtcaaccc 360 gatgagaacg agagccaaac ggacgtcagc tttcggcatataaccgagat aaccatactc 420 acggtccagt tgattgttga gtttgctaaa ggtctaccagcgtttacaaa gataccccag 480 gaggaccaga tcacgttact aaaggcctgc tcgtcggaggtgatgatgct gcgtatggca 540 cgacgctatg accacagctc ggactcaata ttcttcgcgaataatagatc atatacgcgg 600 
gattcttaca aaatggccgg aatggctgat aacattgaagacctgctgca tttctgccgc 660 caaatgttct cgatgaaggt ggacaacgtc gaatacgcgcttctcactgc cattgtgatc 720 ttctcggacc ggccgggcct ggagaaggcc caactagtcgaagcgatcca gagctactac 780 atcgacacgc tacgcattta tatactcaac cgccactgcggcgactcaat gagcctcgtc 840 ttctacgcaa agctgctctc gatcctcacc gagctgcgtacgctgggcaa ccagaacgcc 900 gagatgtgtt tctcactaaa gctcaaaaac cgcaaactgcccaagttcct cgaggagatc 960 tgggacgttc atgccatccc gccatcggtc cagtcgcaccttcagattac ccaggaggag 1020 aacgagcgtc tcgagcgggc tgagcgtatg cgggcatcggttgggggcgc cattaccgcc 1080 ggcattgatt gcgactctgc ctccacttcg gcggcggcagccgcggccca gcatcagcct 1140 cagcctcagc cccagcccca accctcctcc ctgacccagaacgattccca gcaccagaca 1200 cagccgcagc tacaacctca gctaccacct cagctgcaaggtcaactgca accccagctc 1260 caaccacagc ttcagacgca actccagcca cagattcaaccacagccaca gctccttccc 1320 gtctccgctc ccgtgcccgc ctccgtaacc gcacctggttccttgtccgc ggtcagtacg 1380 agcagcgaat acatgggcgg aagtgcggcc ataggacccatcacgccggc aaccaccagc 1440 agtatcacgg ctgccgttac cgctagctcc accacatcagcggtaccgat gggcaacgga 1500 gttggagtcg gtgttggggt gggcggcaac gtcagcatgtatgcgaacgc ccagacggcg 1560 atggccttga tgggtgtagc cctgcattcg caccaagagcagcttatcgg gggagtggcg 1620 gttaagtcgg agcactcgac gactgcatag 1650 68 1586DNA Bamecia argentifoli 68 gaattcgcgg ccgctcgcaa acttccgtac ctctcaccccctcgccagga ccccccgcca 60 accagttcac cgtcatctcc tccaatggat actcatcccccatgtcttcg ggcagctacg 120 acccttatag tcccaccaat ggaagaatag ggaaagaagagctttcgccg gcgaatagtc 180 tgaacgggta caacgtggat agctgcgatg cgtcgcggaagaagaaggga ggaacgggtc 240 ggcagcagga ggagctgtgt ctcgtctgcg gggaccgcgcctccggctac cactacaacg 300 ccctcacctg cgaaggctgc aagggcttct tccgtcggagcatcaccaag aatgccgtct 360 accagtgtaa atatggaaat aattgtgaaa ttgacatgtacatgaggcga aaatgccaag 420 agtgtcgtct caagaagtgt ctcagcgttg gcatgaggccagaatgtgta gttcccgaat 480 tccagtgtgc tgtgaagcga aaagagaaaa aagcgcaaaaggacaaagat aaacctaact 540 caacgacgag ttgttctcca gatggaatca aacaagagatagatcctcaa aggctggata 600 cagattcgca gctattgtct gtaaatggag ttaaacccattactccagag 
caagaagagc 660 tcatccatag gctagtttat tttcaaaatg aatatgaacatccatcccca gaggatatca 720 aaaggatagt taatgctgca ccagaagaag aaaatgtagctgaagaaagg tttaggcata 780 ttacagaaat tacaattctc actgtacagt taattgtggaattttctaag cgattacctg 840 gttttgacaa actaattcgt gaagatcaaa tagctttattaaaggcatgt agtagtgaag 900 taatgatgtt tagaatggca aggaggtatg atgctgaaacagattcgata ttgtttgcaa 960 ctaaccagcc gtatacgaga gaatcataca ctgtagctggcatgggtgat actgtggagg 1020 atctgctccg attttgtcga catatgtgtg ccatgaaagtcgataacgca gaatatgctc 1080 ttctcactgc cattgtaatt ttttcagaac gaccatctctaagtgaaggc tggaaggttg 1140 agaagattca agaaatttac atagaagcat taaaagcatatgttgaaaat cgaaggaaac 1200 catatgcaac aaccattttt gctaagttac tatctgttttaactgaacta cgaacattag 1260 ggaatatgaa ttcagaaaca tgcttctcat tgaagctgaagaatagaaag gtgccatcct 1320 tcctcgagga gatttgggat gttgtttcat aaacagtcttacctcaattc catgttactt 1380 ttcatatttg atttatctca gcaggtggct cagtacttatcctcacatta ctgagctcac 1440 ggtatgctca tacaattata acttgtaata tcatatcggtgatgacaaat ttgttacaat 1500 attctttgtt accttaacac aatgttgatc tcataatgatgtatgaattt ttctgttttt 1560 gcaaaaaaaa aagcggccgc gaattc 1586 69 1109 DNANephotetix cincticeps 69 caggaggagc tctgcctgtt gtgcggagac cgagcgtcgggataccacta caacgctctc 60 acctgcgaag gatgcaaggg cttctttcgg aggagtatcaccaaaaacgc agtgtaccag 120 tccaaatacg gcaccaattg tgaaatagac atgtatatgcggcgcaagtg ccaggagtgc 180 cgactcaaga agtgcctcag tgtagggatg aggccagaatgtgtagtacc tgagtatcaa 240 tgtgccgtaa aaaggaaaga gaaaaaagct caaaaggacaaagataaacc tgtctcttca 300 accaatggct cgcctgaaat gagaatagac caggacaaccgttgtgtggt gttgcagagt 360 gaagacaaca ggtacaactc gagtacgccc agtttcggagtcaaacccct cagtccagaa 420 caagaggagc tcatccacag gctcgtctac ttccagaacgagtacgaaca ccctgccgag 480 gaggatctca agcggatcga gaacctcccc tgtgacgacgatgacccgtg tgatgttcgc 540 tacaaacaca ttacggagat cacaatactc acagtccagctcatcgtgga gtttgcgaaa 600 aaactgcctg gtttcgacaa actactgaga gaggaccagatcgtgttgct caaggcgtgt 660 tcgagcgagg tgatgatgct gcggatggcg cggaggtacgacgtccagac agactcgatc 720 ctgttcgcca acaaccagcc gtacacgcga 
gagtcgtacacgatggcagg cgtgggggaa 780 gtcatcgaag atctgctgcg gttcggccga ctcatgtgctccatgaaggt ggacaatgcc 840 gagtatgctc tgctcacggc catcgtcatc ttctccgagcggccgaacct ggcggaagga 900 tggaaggttg agaagatcca ggagatctac ctggaggcgctcaagtccta cgtggacaac 960 cgagtgaaac ctcgcagtcc gaccatcttc gccaaactgctctccgttct caccgagctg 1020 cgaacactcg gcaaccagaa ctccgagatg tgcttctcgttaaactacgc aaccgcaaac 1080 atgccaccgt tcctcgaaga aatctggga 1109 70 401PRT Choristoneura fumiferana 70 Cys Leu Val Cys Gly Asp Arg Ala Ser GlyTyr His Tyr Asn Ala Leu 1 5 10 15 Thr Cys Glu Gly Cys Lys Gly Phe PheArg Arg Ser Val Thr Lys Asn 20 25 30 Ala Val Tyr Ile Cys Lys Phe Gly HisAla Cys Glu Met Asp Met Tyr 35 40 45 Met Arg Arg Lys Cys Gln Glu Cys ArgLeu Lys Lys Cys Leu Ala Val 50 55 60 Gly Met Arg Pro Glu Cys Val Val ProGlu Thr Gln Cys Ala Met Lys 65 70 75 80 Arg Lys Glu Lys Lys Ala Gln LysGlu Lys Asp Lys Leu Pro Val Ser 85 90 95 Thr Thr Thr Val Asp Asp His MetPro Pro Ile Met Gln Cys Glu Pro 100 105 110 Pro Pro Pro Glu Ala Ala ArgIle His Glu Val Val Pro Arg Phe Leu 115 120 125 Ser Asp Lys Leu Leu GluThr Asn Arg Gln Lys Asn Ile Pro Gln Leu 130 135 140 Thr Ala Asn Gln GlnPhe Leu Ile Ala Arg Leu Ile Trp Tyr Gln Asp 145 150 155 160 Gly Tyr GluGln Pro Ser Asp Glu Asp Leu Lys Arg Ile Thr Gln Thr 165 170 175 Trp GlnGln Ala Asp Asp Glu Asn Glu Glu Ser Asp Thr Pro Phe Arg 180 185 190 GlnIle Thr Glu Met Thr Ile Leu Thr Val Gln Leu Ile Val Glu Phe 195 200 205Ala Lys Gly Leu Pro Gly Phe Ala Lys Ile Ser Gln Pro Asp Gln Ile 210 215220 Thr Leu Leu Lys Ala Cys Ser Ser Glu Val Met Met Leu Arg Val Ala 225230 235 240 Arg Arg Tyr Asp Ala Ala Ser Asp Ser Val Leu Phe Ala Asn AsnGln 245 250 255 Ala Tyr Thr Arg Asp Asn Tyr Arg Lys Ala Gly Met Ala TyrVal Ile 260 265 270 Glu Asp Leu Leu His Phe Cys Arg Cys Met Tyr Ser MetAla Leu Asp 275 280 285 Asn Ile His Tyr Ala Leu Leu Thr Ala Val Val IlePhe Ser Asp Arg 290 295 300 Pro Gly Leu Glu Gln Pro Gln Leu Val Glu GluIle Gln Arg Tyr Tyr 305 310 315 320 Leu Asn Thr Leu Arg Ile Tyr Ile LeuAsn Gln 
Leu Ser Gly Ser Ala 325 330 335 Arg Ser Ser Val Ile Tyr Gly LysIle Leu Ser Ile Leu Ser Glu Leu 340 345 350 Arg Thr Leu Gly Met Gln AsnSer Asn Met Cys Ile Ser Leu Lys Leu 355 360 365 Lys Asn Arg Lys Leu ProPro Phe Leu Glu Glu Ile Trp Asp Val Ala 370 375 380 Asp Met Ser His ThrGln Pro Pro Pro Ile Leu Glu Ser Pro Thr Asn 385 390 395 400 Leu 71 894DNA Tenebrio molitor 71 aggccggaat gtgtggtacc ggaagtacag tgtgctgttaagagaaaaga gaagaaagcc 60 caaaaggaaa aagataaacc aaacagcact actaacggctcaccagacgt catcaaaatt 120 gaaccagaat tgtcagattc agaaaaaaca ttgactaacggacgcaatag gatatcacca 180 gagcaagagg agctcatact catacatcga ttggtttatttccaaaacga atatgaacat 240 ccgtctgaag aagacgttaa acggattatc aatcagccgatagatggtga agatcagtgt 300 gagatacggt ttaggcatac cacggaaatt acgatcctgactgtgcagct gatcgtggag 360 tttgccaagc ggttaccagg cttcgataag ctcctgcaggaagatcaaat tgctctcttg 420 aaggcatgtt caagcgaagt gatgatgttc aggatggcccgacgttacga cgtccagtcg 480 gattccatcc tcttcgtaaa caaccagcct tatccgagggacagttacaa tttggccggt 540 atgggggaaa ccatcgaaga tctcttgcat ttttgcagaactatgtactc catgaaggtg 600 gataatgccg aatatgcttt actaacagcc atcgttattttctcagagcg accgtcgttg 660 atagaaggct ggaaggtgga gaagatccaa gaaatctatttagaggcatt gcgggcgtac 720 gtcgacaacc gaagaagccc aagccggggc acaatattcgcgaaactcct gtcagtacta 780 actgaattgc ggacgttagg caaccaaaat tcagagatgtgcatctcgtt gaaattgaaa 840 aacaaaaagt taccgccgtt cctggacgaa atctgggacgtcgacttaaa agca 894 72 298 PRT Tenebrio molitor 72 Arg Pro Glu Cys ValVal Pro Glu Val Gln Cys Ala Val Lys Arg Lys 1 5 10 15 Glu Lys Lys AlaGln Lys Glu Lys Asp Lys Pro Asn Ser Thr Thr Asn 20 25 30 Gly Ser Pro AspVal Ile Lys Ile Glu Pro Glu Leu Ser Asp Ser Glu 35 40 45 Lys Thr Leu ThrAsn Gly Arg Asn Arg Ile Ser Pro Glu Gln Glu Glu 50 55 60 Leu Ile Leu IleHis Arg Leu Val Tyr Phe Gln Asn Glu Tyr Glu His 65 70 75 80 Pro Ser GluGlu Asp Val Lys Arg Ile Ile Asn Gln Pro Ile Asp Gly 85 90 95 Glu Asp GlnCys Glu Ile Arg Phe Arg His Thr Thr Glu Ile Thr Ile 100 105 110 Leu ThrVal Gln Leu Ile Val Glu Phe Ala Lys Arg Leu Pro Gly 
Phe 115 120 125 AspLys Leu Leu Gln Glu Asp Gln Ile Ala Leu Leu Lys Ala Cys Ser 130 135 140Ser Glu Val Met Met Phe Arg Met Ala Arg Arg Tyr Asp Val Gln Ser 145 150155 160 Asp Ser Ile Leu Phe Val Asn Asn Gln Pro Tyr Pro Arg Asp Ser Tyr165 170 175 Asn Leu Ala Gly Met Gly Glu Thr Ile Glu Asp Leu Leu His PheCys 180 185 190 Arg Thr Met Tyr Ser Met Lys Val Asp Asn Ala Glu Tyr AlaLeu Leu 195 200 205 Thr Ala Ile Val Ile Phe Ser Glu Arg Pro Ser Leu IleGlu Gly Trp 210 215 220 Lys Val Glu Lys Ile Gln Glu Ile Tyr Leu Glu AlaLeu Arg Ala Tyr 225 230 235 240 Val Asp Asn Arg Arg Ser Pro Ser Arg GlyThr Ile Phe Ala Lys Leu 245 250 255 Leu Ser Val Leu Thr Glu Leu Arg ThrLeu Gly Asn Gln Asn Ser Glu 260 265 270 Met Cys Ile Ser Leu Lys Leu LysAsn Lys Lys Leu Pro Pro Phe Leu 275 280 285 Asp Glu Ile Trp Asp Val AspLeu Lys Ala 290 295 73 948 DNA Amblyomma americanum 73 cggccggaatgtgtggtgcc ggagtaccag tgtgccatca agcgggagtc taagaagcac 60 cagaaggaccggccaaacag cacaacgcgg gaaagtccct cggcgctgat ggcgccatct 120 tctgtgggtggcgtgagccc caccagccag cccatgggtg gcggaggcag ctccctgggc 180 agcagcaatcacgaggagga taagaagcca gtggtgctca gcccaggagt caagcccctc 240 tcttcatctcaggaggacct catcaacaag ctagtctact accagcagga gtttgagtcg 300 ccttctgaggaagacatgaa gaaaaccacg cccttccccc tgggagacag tgaggaagac 360 aaccagcggcgattccagca cattactgag atcaccatcc tgacagtgca gctcattgtg 420 gagttctccaagcgggtccc tggctttgac acgctggcac gagaagacca gattactttg 480 ctgaaggcctgctccagtga agtgatgatg ctgagaggtg cccggaaata tgatgtgaag 540 acagattctatagtgtttgc caataaccag ccgtacacga gggacaacta ccgcagtgcc 600 agtgtgggggactctgcaga tgccctgttc cgcttctgcc gcaagatgtg tcagctgaga 660 gtagacaacgctgaatacgc actcctgacg gccattgtaa ttttctctga acggccatca 720 ctggtggacccgcacaaggt ggagcgcatc caggagtact acattgagac cctgcgcatg 780 tactccgagaaccaccggcc cccaggcaag aactactttg cccggctgct gtccatcttg 840 acagagctgcgcaccttggg caacatgaac gccgaaatgt gcttctcgct caaggtgcag 900 aacaagaagctgccaccgtt cctggctgag atttgggaca tccaagag 948 74 316 PRT Amblyommaamericanum 74 Arg Pro Glu Cys Val 
Val Pro Glu Tyr Gln Cys Ala Ile LysArg Glu 1 5 10 15 Ser Lys Lys His Gln Lys Asp Arg Pro Asn Ser Thr ThrArg Glu Ser 20 25 30 Pro Ser Ala Leu Met Ala Pro Ser Ser Val Gly Gly ValSer Pro Thr 35 40 45 Ser Gln Pro Met Gly Gly Gly Gly Ser Ser Leu Gly SerSer Asn His 50 55 60 Glu Glu Asp Lys Lys Pro Val Val Leu Ser Pro Gly ValLys Pro Leu 65 70 75 80 Ser Ser Ser Gln Glu Asp Leu Ile Asn Lys Leu ValTyr Tyr Gln Gln 85 90 95 Glu Phe Glu Ser Pro Ser Glu Glu Asp Met Lys LysThr Thr Pro Phe 100 105 110 Pro Leu Gly Asp Ser Glu Glu Asp Asn Gln ArgArg Phe Gln His Ile 115 120 125 Thr Glu Ile Thr Ile Leu Thr Val Gln LeuIle Val Glu Phe Ser Lys 130 135 140 Arg Val Pro Gly Phe Asp Thr Leu AlaArg Glu Asp Gln Ile Thr Leu 145 150 155 160 Leu Lys Ala Cys Ser Ser GluVal Met Met Leu Arg Gly Ala Arg Lys 165 170 175 Tyr Asp Val Lys Thr AspSer Ile Val Phe Ala Asn Asn Gln Pro Tyr 180 185 190 Thr Arg Asp Asn TyrArg Ser Ala Ser Val Gly Asp Ser Ala Asp Ala 195 200 205 Leu Phe Arg PheCys Arg Lys Met Cys Gln Leu Arg Val Asp Asn Ala 210 215 220 Glu Tyr AlaLeu Leu Thr Ala Ile Val Ile Phe Ser Glu Arg Pro Ser 225 230 235 240 LeuVal Asp Pro His Lys Val Glu Arg Ile Gln Glu Tyr Tyr Ile Glu 245 250 255Thr Leu Arg Met Tyr Ser Glu Asn His Arg Pro Pro Gly Lys Asn Tyr 260 265270 Phe Ala Arg Leu Leu Ser Ile Leu Thr Glu Leu Arg Thr Leu Gly Asn 275280 285 Met Asn Ala Glu Met Cys Phe Ser Leu Lys Val Gln Asn Lys Lys Leu290 295 300 Pro Pro Phe Leu Ala Glu Ile Trp Asp Ile Gln Glu 305 310 31575 825 DNA Drosophila melanogaster 75 gtgtccaggg atttctcgat cgagcgcatcatagaggccg agcagcgagc ggagacccaa 60 tgcggcgatc gtgcactgac gttcctgcgcgttggtccct attccacagt ccagccggac 120 tacaagggtg ccgtgtcggc cctgtgccaagtggtcaaca aacagctctt ccagatggtc 180 gaatacgcgc gcatgatgcc gcactttgcccaggtgccgc tggacgacca ggtgattctg 240 ctgaaagccg cttggatcga gctgctcattgcgaacgtgg cctggtgcag catcgtttcg 300 ctggatgacg gcggtgccgg cggcgggggcggtggactag gccacgatgg ctcctttgag 360 cgacgatcac cgggccttca gccccagcagctgttcctca accagagctt ctcgtaccat 420 cgcaacagtg cgatcaaagc 
cggtgtgtcagccatcttcg accgcatatt gtcggagctg 480 agtgtaaaga tgaagcggct gaatctcgaccgacgcgagc tgtcctgctt gaaggccatc 540 atactgtaca acccggacat acgcgggatcaagagccggg cggagatcga gatgtgccgc 600 gagaaggtgt acgcttgcct ggacgagcactgccgcctgg aacatccggg cgacgatgga 660 cgctttgcgc aactgctgct gcgtctgcccgctttgcgat cgatcagcct gaagtgccag 720 gatcacctgt tcctcttccg cattaccagcgaccggccgc tggaggagct ctttctcgag 780 cagctggagg cgccgccgcc acccggcctggcgatgaaac tggag 825
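
Each nucleotide entry in the sequence listing above gives a declared length after its SEQ ID NO and repeats running position counts between the residue blocks. The following is a minimal sketch, not part of the original listing, of how a declared length could be checked against the residues as printed. The helper names `residues` and `check_length` are hypothetical, and the sketch assumes lowercase nucleotide entries in which every digit is a position count.

```python
import re


def residues(raw: str) -> str:
    """Return only the nucleotide letters of a listing fragment.

    Whitespace and the running position counts (digits) are dropped.
    Assumption: the fragment is a DNA entry written with a, c, g, t, u or n.
    """
    return "".join(re.findall(r"[acgtun]+", raw.lower()))


def check_length(raw: str, declared: int) -> bool:
    """Compare the number of residues found with the declared length."""
    return len(residues(raw)) == declared


# SEQ ID NO: 61 (synthetic E1b minimal promoter), declared length 24
e1b = "tatataatgg atccccgggt accg 24"
# SEQ ID NO: 55 (GAL4 response element), declared length 19
gal4_re = "ggagtactgtcctccgagc 19"

print(check_length(e1b, 24))
print(check_length(gal4_re, 19))
```

Because the check ignores digits wherever they occur, it also tolerates position counts that have fused with neighbouring residues. A mismatch between the counted and declared lengths would suggest that residues were lost or duplicated when the listing was reproduced.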

We claim:
 1. A gene expression modulation system comprising: a) a first gene expression cassette that is capable of being expressed in a host cell comprising a polynucleotide sequence that encodes a first hybrid polypeptide comprising: i) a DNA-binding domain that recognizes a response element associated with a gene whose expression is to be modulated; and ii) an ecdysone receptor ligand binding domain; and b) a second gene expression cassette that is capable of being expressed in the host cell comprising a polynucleotide sequence that encodes a second hybrid polypeptide comprising: i) a transactivation domain; and ii) a chimeric retinoid X receptor ligand binding domain.
 2. The gene expression modulation system according to claim 1, further comprising a third gene expression cassette comprising: i) a response element recognized by the DNA-binding domain of the first hybrid polypeptide; ii) a promoter that is activated by the transactivation domain of the second hybrid polypeptide; and iii) a gene whose expression is to be modulated.
 3. The gene expression modulation system according to claim 1, wherein the ecdysone receptor ligand binding domain (LBD) of the first hybrid polypeptide is selected from the group consisting of a spruce budworm Choristoneura fumiferana EcR (“CfEcR”) LBD, a beetle Tenebrio molitor EcR (“TmEcR”) LBD, a Manduca sexta EcR (“MsEcR”) LBD, a Heliothis virescens EcR (“HvEcR”) LBD, a midge Chironomus tentans EcR (“CtEcR”) LBD, a silk moth Bombyx mori EcR (“BmEcR”) LBD, a fruit fly Drosophila melanogaster EcR (“DmEcR”) LBD, a mosquito Aedes aegypti EcR (“AaEcR”) LBD, a blowfly Lucilia capitata EcR (“LcEcR”) LBD, a blowfly Lucilia cuprina EcR (“LucEcR”) LBD, a Mediterranean fruit fly Ceratitis capitata EcR (“CcEcR”) LBD, a locust Locusta migratoria EcR (“LmEcR”) LBD, an aphid Myzus persicae EcR (“MpEcR”) LBD, a fiddler crab Celuca pugilator EcR (“CpEcR”) LBD, a whitefly Bemisia argentifolii EcR (“BaEcR”) LBD, a leafhopper Nephotettix cincticeps EcR (“NcEcR”) LBD, and an ixodid tick Amblyomma americanum EcR (“AmaEcR”) LBD.
 4. The gene expression modulation system according to claim 1, wherein the ecdysone receptor ligand binding domain of the first hybrid polypeptide is encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of SEQ ID NO: 65 (CfEcR-DEF), SEQ ID NO: 59 (CfEcR-CDEF), SEQ ID NO: 67 (DmEcR-DEF), SEQ ID NO: 71 (TmEcR-DEF) and SEQ ID NO: 73 (AmaEcR-DEF).
 5. The gene expression modulation system according to claim 1, wherein the ecdysone receptor ligand binding domain of the first hybrid polypeptide comprises an amino acid sequence selected from the group consisting of SEQ ID NO: 57 (CfEcR-DEF), SEQ ID NO: 58 (DmEcR-DEF), SEQ ID NO: 70 (CfEcR-CDEF), SEQ ID NO: 72 (TmEcR-DEF) or SEQ ID NO: 74 (AmaEcR-DEF).
 6. The gene expression modulation system according to claim 1, wherein the chimeric retinoid X receptor ligand binding domain of the second hybrid polypeptide comprises at least two different retinoid X receptor ligand binding domain fragments selected from the group consisting of a vertebrate species retinoid X receptor ligand binding domain fragment, an invertebrate species retinoid X receptor ligand binding domain fragment, and a non-Dipteran/non-Lepidopteran invertebrate species retinoid X receptor homolog ligand binding domain fragment.
 7. The gene expression modulation system according to claim 1, wherein the chimeric retinoid X receptor ligand binding domain of the second hybrid polypeptide comprises a retinoid X receptor ligand binding domain comprising at least one retinoid X receptor ligand binding domain fragment selected from the group consisting of an EF-domain helix 1, an EF-domain helix 2, an EF-domain helix 3, an EF-domain helix 4, an EF-domain helix 5, an EF-domain helix 6, an EF-domain helix 7, an EF-domain helix 8, an EF-domain helix 9, an EF-domain helix 10, an EF-domain helix 11, an EF-domain helix 12, an F-domain, and an EF-domain β-pleated sheet, wherein the retinoid X receptor ligand binding domain fragment is from a different species retinoid X receptor ligand binding domain or a different isoform retinoid X receptor ligand binding domain than the retinoid X receptor ligand binding domain.
 8. The gene expression modulation system according to claim 1, wherein the chimeric retinoid X receptor ligand binding domain of the second hybrid polypeptide is encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of a) SEQ ID NO: 45, b) nucleotides 1-348 of SEQ ID NO: 13 and nucleotides 268-630 of SEQ ID NO: 21, c) nucleotides 1-408 of SEQ ID NO: 13 and nucleotides 337-630 of SEQ ID NO: 21, d) nucleotides 1-465 of SEQ ID NO: 13 and nucleotides 403-630 of SEQ ID NO: 21, e) nucleotides 1-555 of SEQ ID NO: 13 and nucleotides 490-630 of SEQ ID NO: 21, f) nucleotides 1-624 of SEQ ID NO: 13 and nucleotides 547-630 of SEQ ID NO: 21, g) nucleotides 1-645 of SEQ ID NO: 13 and nucleotides 601-630 of SEQ ID NO: 21, and h) nucleotides 1-717 of SEQ ID NO: 13 and nucleotides 613-630 of SEQ ID NO: 21.
 9. The gene expression modulation system according to claim 1, wherein the chimeric retinoid X receptor ligand binding domain of the second hybrid polypeptide comprises an amino acid sequence selected from the group consisting of a) SEQ ID NO: 46, b) amino acids 1-116 of SEQ ID NO: 13 and amino acids 90-210 of SEQ ID NO: 21, c) amino acids 1-136 of SEQ ID NO: 13 and amino acids 113-210 of SEQ ID NO: 21, d) amino acids 1-155 of SEQ ID NO: 13 and amino acids 135-210 of SEQ ID NO: 21, e) amino acids 1-185 of SEQ ID NO: 13 and amino acids 164-210 of SEQ ID NO: 21, f) amino acids 1-208 of SEQ ID NO: 13 and amino acids 183-210 of SEQ ID NO: 21, g) amino acids 1-215 of SEQ ID NO: 13 and amino acids 201-210 of SEQ ID NO: 21, and h) amino acids 1-239 of SEQ ID NO: 13 and amino acids 205-210 of SEQ ID NO: 21.
 10. The gene expression modulation system according to claim 1, wherein the first gene expression cassette comprises a polynucleotide sequence that encodes the first hybrid polypeptide comprising a DNA-binding domain selected from the group consisting of a GAL4 DNA-binding domain and a LexA DNA-binding domain, and an ecdysone receptor ligand binding domain.
 11. The gene expression modulation system according to claim 1, wherein the second gene expression cassette comprises a polynucleotide that encodes the second hybrid polypeptide comprising a transactivation domain selected from the group consisting of a VP16 transactivation domain and a B42 acidic activator transactivation domain, and a chimeric retinoid X receptor ligand binding domain.
 12. The gene expression modulation system according to claim 1, wherein the second gene expression cassette comprises a polynucleotide that encodes the second hybrid polypeptide comprising a transactivation domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of a VP16 AD (SEQ ID NO: 51) and a B42 AD (SEQ ID NO: 53), and a chimeric retinoid X receptor ligand binding domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of a) SEQ ID NO: 45, b) nucleotides 1-348 of SEQ ID NO: 13 and nucleotides 268-630 of SEQ ID NO: 21, c) nucleotides 1-408 of SEQ ID NO: 13 and nucleotides 337-630 of SEQ ID NO: 21, d) nucleotides 1-465 of SEQ ID NO: 13 and nucleotides 403-630 of SEQ ID NO: 21, e) nucleotides 1-555 of SEQ ID NO: 13 and nucleotides 490-630 of SEQ ID NO: 21, f) nucleotides 1-624 of SEQ ID NO: 13 and nucleotides 547-630 of SEQ ID NO: 21, g) nucleotides 1-645 of SEQ ID NO: 13 and nucleotides 601-630 of SEQ ID NO: 21, and h) nucleotides 1-717 of SEQ ID NO: 13 and nucleotides 613-630 of SEQ ID NO: 21.
 13. The gene expression modulation system according to claim 1, wherein the second gene expression cassette comprises a polynucleotide that encodes the second hybrid polypeptide comprising a transactivation domain comprising an amino acid sequence selected from the group consisting of a VP16 AD (SEQ ID NO: 52) and a B42 AD (SEQ ID NO: 54), and a chimeric retinoid X receptor ligand binding domain comprising an amino acid sequence selected from the group consisting of a) SEQ ID NO: 46, b) amino acids 1-116 of SEQ ID NO: 13 and amino acids 90-210 of SEQ ID NO: 21, c) amino acids 1-136 of SEQ ID NO: 13 and amino acids 113-210 of SEQ ID NO: 21, d) amino acids 1-155 of SEQ ID NO: 13 and amino acids 135-210 of SEQ ID NO: 21, e) amino acids 1-185 of SEQ ID NO: 13 and amino acids 164-210 of SEQ ID NO: 21, f) amino acids 1-208 of SEQ ID NO: 13 and amino acids 183-210 of SEQ ID NO: 21, g) amino acids 1-215 of SEQ ID NO: 13 and amino acids 201-210 of SEQ ID NO: 21, and h) amino acids 1-239 of SEQ ID NO: 13 and amino acids 205-210 of SEQ ID NO: 21.
 14. A gene expression modulation system comprising: a) a first gene expression cassette that is capable of being expressed in a host cell comprising a polynucleotide sequence that encodes a first hybrid polypeptide comprising: i) a DNA-binding domain that recognizes a response element associated with a gene whose expression is to be modulated; and ii) a chimeric retinoid X receptor ligand binding domain; and b) a second gene expression cassette that is capable of being expressed in the host cell comprising a polynucleotide sequence that encodes a second hybrid polypeptide comprising: i) a transactivation domain; and ii) an ecdysone receptor ligand binding domain.
 15. The gene expression modulation system according to claim 14, further comprising a third gene expression cassette comprising: i) a response element recognized by the DNA-binding domain of the first hybrid polypeptide; ii) a promoter that is activated by the transactivation domain of the second hybrid polypeptide; and iii) a gene whose expression is to be modulated.
 16. The gene expression modulation system according to claim 14, wherein the chimeric retinoid X receptor ligand binding domain of the first hybrid polypeptide comprises at least two different retinoid X receptor ligand binding domain fragments selected from the group consisting of a vertebrate species retinoid X receptor ligand binding domain fragment, an invertebrate species retinoid X receptor ligand binding domain fragment, and a non-Dipteran/non-Lepidopteran invertebrate species retinoid X receptor homolog ligand binding domain fragment.
 17. The gene expression modulation system according to claim 14, wherein the chimeric retinoid X receptor ligand binding domain of the first hybrid polypeptide comprises a retinoid X receptor ligand binding domain comprising at least one retinoid X receptor ligand binding domain fragment selected from the group consisting of an EF-domain helix 1, an EF-domain helix 2, an EF-domain helix 3, an EF-domain helix 4, an EF-domain helix 5, an EF-domain helix 6, an EF-domain helix 7, an EF-domain helix 8, an EF-domain helix 9, an EF-domain helix 10, an EF-domain helix 11, an EF-domain helix 12, an F-domain, and an EF-domain β-pleated sheet, wherein the retinoid X receptor ligand binding domain fragment is from a different species retinoid X receptor ligand binding domain or a different isoform retinoid X receptor ligand binding domain than the retinoid X receptor ligand binding domain.
 18. The gene expression modulation system according to claim 14, wherein the chimeric retinoid X receptor ligand binding domain of the first hybrid polypeptide is encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of a) SEQ ID NO: 45, b) nucleotides 1-348 of SEQ ID NO: 13 and nucleotides 268-630 of SEQ ID NO: 21, c) nucleotides 1-408 of SEQ ID NO: 13 and nucleotides 337-630 of SEQ ID NO: 21, d) nucleotides 1-465 of SEQ ID NO: 13 and nucleotides 403-630 of SEQ ID NO: 21, e) nucleotides 1-555 of SEQ ID NO: 13 and nucleotides 490-630 of SEQ ID NO: 21, f) nucleotides 1-624 of SEQ ID NO: 13 and nucleotides 547-630 of SEQ ID NO: 21, g) nucleotides 1-645 of SEQ ID NO: 13 and nucleotides 601-630 of SEQ ID NO: 21, and h) nucleotides 1-717 of SEQ ID NO: 13 and nucleotides 613-630 of SEQ ID NO: 21.
 19. The gene expression modulation system according to claim 14, wherein the chimeric retinoid X receptor ligand binding domain of the first hybrid polypeptide comprises an amino acid sequence selected from the group consisting of a) SEQ ID NO: 46, b) amino acids 1-116 of SEQ ID NO: 13 and amino acids 90-210 of SEQ ID NO: 21, c) amino acids 1-136 of SEQ ID NO: 13 and amino acids 113-210 of SEQ ID NO: 21, d) amino acids 1-155 of SEQ ID NO: 13 and amino acids 135-210 of SEQ ID NO: 21, e) amino acids 1-185 of SEQ ID NO: 13 and amino acids 164-210 of SEQ ID NO: 21, f) amino acids 1-208 of SEQ ID NO: 13 and amino acids 183-210 of SEQ ID NO: 21, g) amino acids 1-215 of SEQ ID NO: 13 and amino acids 201-210 of SEQ ID NO: 21, and h) amino acids 1-239 of SEQ ID NO: 13 and amino acids 205-210 of SEQ ID NO: 21.
 20. The gene expression modulation system according to claim 14, wherein the ecdysone receptor ligand binding domain of the second hybrid polypeptide is encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of SEQ ID NO: 65 (CfEcR-DEF), SEQ ID NO: 59 (CfEcR-CDEF), SEQ ID NO: 67 (DmEcR-DEF), SEQ ID NO: 71 (TmEcR-DEF) and SEQ ID NO: 73 (AmaEcR-DEF).
 21. The gene expression modulation system according to claim 14, wherein the ecdysone receptor ligand binding domain of the second hybrid polypeptide comprises an amino acid sequence selected from the group consisting of SEQ ID NO: 57 (CfEcR-DEF), SEQ ID NO: 58 (DmEcR-DEF), SEQ ID NO: 70 (CfEcR-CDEF), SEQ ID NO: 72 (TmEcR-DEF) or SEQ ID NO: 74 (AmaEcR-DEF).
 22. The gene expression modulation system according to claim 14, wherein the first gene expression cassette comprises a polynucleotide that encodes the first hybrid polypeptide comprising a DNA-binding domain selected from the group consisting of a GAL4 DNA-binding domain and a LexA DNA-binding domain, and a chimeric retinoid X receptor ligand binding domain.
 23. The gene expression modulation system according to claim 14, wherein the first gene expression cassette comprises a polynucleotide that encodes the first hybrid polypeptide comprising a DNA-binding domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of a GAL4 DBD (SEQ ID NO: 47) and a LexA DBD (SEQ ID NO: 49), and a chimeric retinoid X receptor ligand binding domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of a) SEQ ID NO: 45, b) nucleotides 1-348 of SEQ ID NO: 13 and nucleotides 268-630 of SEQ ID NO: 21, c) nucleotides 1-408 of SEQ ID NO: 13 and nucleotides 337-630 of SEQ ID NO: 21, d) nucleotides 1-465 of SEQ ID NO: 13 and nucleotides 403-630 of SEQ ID NO: 21, e) nucleotides 1-555 of SEQ ID NO: 13 and nucleotides 490-630 of SEQ ID NO: 21, f) nucleotides 1-624 of SEQ ID NO: 13 and nucleotides 547-630 of SEQ ID NO: 21, g) nucleotides 1-645 of SEQ ID NO: 13 and nucleotides 601-630 of SEQ ID NO: 21, and h) nucleotides 1-717 of SEQ ID NO: 13 and nucleotides 613-630 of SEQ ID NO: 21.
 24. The gene expression modulation system according to claim 14, wherein the first gene expression cassette comprises a polynucleotide that encodes the first hybrid polypeptide comprising a DNA-binding domain comprising an amino acid sequence selected from the group consisting of a GAL4 DBD (SEQ ID NO: 48) and a LexA DBD (SEQ ID NO: 50), and a chimeric retinoid X receptor ligand binding domain comprising an amino acid sequence selected from the group consisting of a) SEQ ID NO: 46, b) amino acids 1-116 of SEQ ID NO: 13 and amino acids 90-210 of SEQ ID NO: 21, c) amino acids 1-136 of SEQ ID NO: 13 and amino acids 113-210 of SEQ ID NO: 21, d) amino acids 1-155 of SEQ ID NO: 13 and amino acids 135-210 of SEQ ID NO: 21, e) amino acids 1-185 of SEQ ID NO: 13 and amino acids 164-210 of SEQ ID NO: 21, f) amino acids 1-208 of SEQ ID NO: 13 and amino acids 183-210 of SEQ ID NO: 21, g) amino acids 1-215 of SEQ ID NO: 13 and amino acids 201-210 of SEQ ID NO: 21, and h) amino acids 1-239 of SEQ ID NO: 13 and amino acids 205-210 of SEQ ID NO: 21.
 25. The gene expression modulation system according to claim 14, wherein the second gene expression cassette comprises a polynucleotide that encodes the second hybrid polypeptide comprising a transactivation domain selected from the group consisting of a VP16 transactivation domain and a B42 acidic activator transactivation domain, and an ecdysone receptor ligand binding domain.
 26. A gene expression cassette comprising a polynucleotide encoding a hybrid polypeptide comprising a DNA-binding domain and a chimeric retinoid X receptor ligand binding domain.
 27. The gene expression cassette according to claim 26, wherein the chimeric retinoid X receptor ligand binding domain comprises at least two different retinoid X receptor ligand binding domain fragments selected from the group consisting of a vertebrate species retinoid X receptor ligand binding domain fragment, an invertebrate species retinoid X receptor ligand binding domain fragment, and a non-Dipteran/non-Lepidopteran invertebrate species retinoid X receptor homolog ligand binding domain fragment.
 28. The gene expression cassette according to claim 26, wherein the chimeric retinoid X receptor ligand binding domain comprises a retinoid X receptor ligand binding domain comprising at least one retinoid X receptor ligand binding domain fragment selected from the group consisting of an EF-domain helix 1, an EF-domain helix 2, an EF-domain helix 3, an EF-domain helix 4, an EF-domain helix 5, an EF-domain helix 6, an EF-domain helix 7, an EF-domain helix 8, an EF-domain helix 9, an EF-domain helix 10, an EF-domain helix 11, an EF-domain helix 12, an F-domain, and an EF-domain β-pleated sheet, wherein the retinoid X receptor ligand binding domain fragment is from a different species retinoid X receptor ligand binding domain or a different isoform retinoid X receptor ligand binding domain than the retinoid X receptor ligand binding domain.
 29. The gene expression cassette according to claim 26, wherein the DNA-binding domain is a GAL4 DNA-binding domain or a LexA DNA-binding domain.
 30. The gene expression cassette according to claim 26, wherein the gene expression cassette comprises a polynucleotide encoding a hybrid polypeptide comprising a DNA-binding domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of a GAL4 DBD (SEQ ID NO: 47) and a LexA DBD (SEQ ID NO: 49), and a chimeric retinoid X receptor ligand binding domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of a) SEQ ID NO: 45, b) nucleotides 1-348 of SEQ ID NO: 13 and nucleotides 268-630 of SEQ ID NO: 21, c) nucleotides 1-408 of SEQ ID NO: 13 and nucleotides 337-630 of SEQ ID NO: 21, d) nucleotides 1-465 of SEQ ID NO: 13 and nucleotides 403-630 of SEQ ID NO: 21, e) nucleotides 1-555 of SEQ ID NO: 13 and nucleotides 490-630 of SEQ ID NO: 21, f) nucleotides 1-624 of SEQ ID NO: 13 and nucleotides 547-630 of SEQ ID NO: 21, g) nucleotides 1-645 of SEQ ID NO: 13 and nucleotides 601-630 of SEQ ID NO: 21, and h) nucleotides 1-717 of SEQ ID NO: 13 and nucleotides 613-630 of SEQ ID NO: 21.
 31. The gene expression cassette according to claim 26, wherein the gene expression cassette comprises a polynucleotide encoding a hybrid polypeptide comprising a DNA-binding domain comprising an amino acid sequence selected from the group consisting of a GAL4 DBD (SEQ ID NO: 48) and a LexA DBD (SEQ ID NO: 50), and a chimeric retinoid X receptor ligand binding domain comprising an amino acid sequence selected from the group consisting of a) SEQ ID NO: 46, b) amino acids 1-116 of SEQ ID NO: 13 and amino acids 90-210 of SEQ ID NO: 21, c) amino acids 1-136 of SEQ ID NO: 13 and amino acids 113-210 of SEQ ID NO: 21, d) amino acids 1-155 of SEQ ID NO: 13 and amino acids 135-210 of SEQ ID NO: 21, e) amino acids 1-185 of SEQ ID NO: 13 and amino acids 164-210 of SEQ ID NO: 21, f) amino acids 1-208 of SEQ ID NO: 13 and amino acids 183-210 of SEQ ID NO: 21, g) amino acids 1-215 of SEQ ID NO: 13 and amino acids 201-210 of SEQ ID NO: 21, and h) amino acids 1-239 of SEQ ID NO: 13 and amino acids 205-210 of SEQ ID NO: 21.
 32. A gene expression cassette comprising a polynucleotide encoding a hybrid polypeptide comprising a transactivation domain and a chimeric retinoid X receptor ligand binding domain.
 33. The gene expression cassette according to claim 32, wherein the transactivation domain is a VP16 transactivation domain or a B42 acidic activator transactivation domain.
 34. The gene expression cassette according to claim 32, wherein the gene expression cassette comprises a polynucleotide encoding a hybrid polypeptide comprising a transactivation domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of a VP16 AD (SEQ ID NO: 51) and a B42 AD (SEQ ID NO: 53), and a chimeric retinoid X receptor ligand binding domain encoded by a polynucleotide comprising a nucleic acid sequence selected from the group consisting of a) SEQ ID NO: 45, b) nucleotides 1-348 of SEQ ID NO: 13 and nucleotides 268-630 of SEQ ID NO: 21, c) nucleotides 1-408 of SEQ ID NO: 13 and nucleotides 337-630 of SEQ ID NO: 21, d) nucleotides 1-465 of SEQ ID NO: 13 and nucleotides 403-630 of SEQ ID NO: 21, e) nucleotides 1-555 of SEQ ID NO: 13 and nucleotides 490-630 of SEQ ID NO: 21, f) nucleotides 1-624 of SEQ ID NO: 13 and nucleotides 547-630 of SEQ ID NO: 21, g) nucleotides 1-645 of SEQ ID NO: 13 and nucleotides 601-630 of SEQ ID NO: 21, and h) nucleotides 1-717 of SEQ ID NO: 13 and nucleotides 613-630 of SEQ ID NO: 21.
 35. The gene expression cassette according to claim 32, wherein the gene expression cassette comprises a polynucleotide encoding a hybrid polypeptide comprising a transactivation domain comprising an amino acid sequence selected from the group consisting of a VP16 AD (SEQ ID NO: 52) and a B42 AD (SEQ ID NO: 54), and a chimeric retinoid X receptor ligand binding domain comprising an amino acid sequence selected from the group consisting of a) SEQ ID NO: 46, b) amino acids 1-116 of SEQ ID NO: 13 and amino acids 90-210 of SEQ ID NO: 21, c) amino acids 1-136 of SEQ ID NO: 13 and amino acids 113-210 of SEQ ID NO: 21, d) amino acids 1-155 of SEQ ID NO: 13 and amino acids 135-210 of SEQ ID NO: 21, e) amino acids 1-185 of SEQ ID NO: 13 and amino acids 164-210 of SEQ ID NO: 21, f) amino acids 1-208 of SEQ ID NO: 13 and amino acids 183-210 of SEQ ID NO: 21, g) amino acids 1-215 of SEQ ID NO: 13 and amino acids 201-210 of SEQ ID NO: 21, and h) amino acids 1-239 of SEQ ID NO: 13 and amino acids 205-210 of SEQ ID NO: 21.
 36. An isolated polynucleotide encoding a truncated chimeric retinoid X receptor ligand binding domain comprising a truncation mutation, wherein the truncation mutation reduces ligand binding activity of the truncated chimeric retinoid X receptor ligand binding domain.
 37. An isolated polynucleotide encoding a truncated chimeric retinoid X receptor ligand binding domain comprising a truncation mutation, wherein the truncation mutation reduces steroid binding activity of the truncated chimeric retinoid X receptor ligand binding domain.
 38. An isolated polynucleotide encoding a truncated chimeric retinoid X receptor ligand binding domain comprising a truncation mutation, wherein the truncation mutation reduces non-steroid binding activity of the truncated chimeric retinoid X receptor ligand binding domain.
 39. An isolated polynucleotide encoding a truncated chimeric retinoid X receptor ligand binding domain comprising a truncation mutation, wherein the truncation mutation enhances ligand binding activity of the truncated chimeric retinoid X receptor ligand binding domain.
 40. An isolated polynucleotide encoding a truncated chimeric retinoid X receptor ligand binding domain comprising a truncation mutation, wherein the truncation mutation enhances steroid binding activity of the truncated chimeric retinoid X receptor ligand binding domain.
 41. An isolated polynucleotide encoding a truncated chimeric retinoid X receptor ligand binding domain comprising a truncation mutation, wherein the truncation mutation enhances non-steroid binding activity of the truncated chimeric retinoid X receptor ligand binding domain.
 42. An isolated polynucleotide encoding a truncated chimeric retinoid X receptor ligand binding domain comprising a truncation mutation, wherein the truncation mutation increases ligand sensitivity of the truncated chimeric retinoid X receptor ligand binding domain.
 43. An isolated polynucleotide encoding a truncated chimeric retinoid X receptor ligand binding domain comprising a truncation mutation, wherein the truncation mutation increases ligand sensitivity of a heterodimer, wherein the heterodimer comprises the truncated chimeric retinoid X receptor ligand binding domain and a dimerization partner.
 44. The isolated polynucleotide according to claim 43, wherein the dimerization partner is an ecdysone receptor polypeptide.
 45. An isolated polynucleotide encoding a chimeric retinoid X receptor ligand binding domain, wherein the polynucleotide comprises a nucleic acid sequence selected from the group consisting of a) SEQ ID NO: 45, b) nucleotides 1-348 of SEQ ID NO: 13 and nucleotides 268-630 of SEQ ID NO: 21, c) nucleotides 1-408 of SEQ ID NO: 13 and nucleotides 337-630 of SEQ ID NO: 21, d) nucleotides 1-465 of SEQ ID NO: 13 and nucleotides 403-630 of SEQ ID NO: 21, e) nucleotides 1-555 of SEQ ID NO: 13 and nucleotides 490-630 of SEQ ID NO: 21, f) nucleotides 1-624 of SEQ ID NO: 13 and nucleotides 547-630 of SEQ ID NO: 21, g) nucleotides 1-645 of SEQ ID NO: 13 and nucleotides 601-630 of SEQ ID NO: 21, and h) nucleotides 1-717 of SEQ ID NO: 13 and nucleotides 613-630 of SEQ ID NO: 21.
 46. An isolated polypeptide encoded by the isolated polynucleotide according to claim 45.
 47. An isolated chimeric retinoid X receptor polypeptide comprising an amino acid sequence selected from the group consisting of a) SEQ ID NO: 46, b) amino acids 1-116 of SEQ ID NO: 13 and amino acids 90-210 of SEQ ID NO: 21, c) amino acids 1-136 of SEQ ID NO: 13 and amino acids 113-210 of SEQ ID NO: 21, d) amino acids 1-155 of SEQ ID NO: 13 and amino acids 135-210 of SEQ ID NO: 21, e) amino acids 1-185 of SEQ ID NO: 13 and amino acids 164-210 of SEQ ID NO: 21, f) amino acids 1-208 of SEQ ID NO: 13 and amino acids 183-210 of SEQ ID NO: 21, g) amino acids 1-215 of SEQ ID NO: 13 and amino acids 201-210 of SEQ ID NO: 21, and h) amino acids 1-239 of SEQ ID NO: 13 and amino acids 205-210 of SEQ ID NO: 21.
 48. A method of modulating the expression of a gene in a host cell comprising the gene to be modulated comprising the steps of: a) introducing into the host cell the gene expression modulation system according to claim 1; and b) introducing into the host cell a ligand; wherein the gene to be modulated is a component of a gene expression cassette comprising: i) a response element recognized by the DNA binding domain from the first hybrid polypeptide; ii) a promoter that is activated by the transactivation domain of the second hybrid polypeptide; and iii) a gene whose expression is to be modulated; whereby upon introduction of the ligand into the host cell, expression of the gene of b)iii) is modulated.
 49. The method according to claim 48, wherein the ligand is a compound of the formula:

wherein: E is a (C₄-C₆)alkyl containing a tertiary carbon or a cyano(C₃-C₅)alkyl containing a tertiary carbon; R¹ is H, Me, Et, i-Pr, F, formyl, CF₃, CHF₂, CHCl₂, CH₂F, CH₂Cl, CH₂OH, CH₂OMe, CH₂CN, CN, C≡CH, 1-propynyl, 2-propynyl, vinyl, OH, OMe, OEt, cyclopropyl, CF₂CF₃, CH═CHCN, allyl, azido, SCN, or SCHF₂; R² is H, Me, Et, n-Pr, i-Pr, formyl, CF₃, CHF₂, CHCl₂, CH₂F, CH₂Cl, CH₂OH, CH₂OMe, CH₂CN, CN, C≡CH, 1-propynyl, 2-propynyl, vinyl, Ac, F, Cl, OH, OMe, OEt, O-n-Pr, OAc, NMe₂, NEt₂, SMe, SEt, SOCF₃, OCF₂CF₂H, COEt, cyclopropyl, CF₂CF₃, CH═CHCN, allyl, azido, OCF₃, OCHF₂, O-i-Pr, SCN, SCHF₂, SOMe, NH-CN, or joined with R³ and the phenyl carbons to which R² and R³ are attached to form an ethylenedioxy, a dihydrofuryl ring with the oxygen adjacent to a phenyl carbon, or a dihydropyryl ring with the oxygen adjacent to a phenyl carbon; R³ is H, Et, or joined with R² and the phenyl carbons to which R² and R³ are attached to form an ethylenedioxy, a dihydrofuryl ring with the oxygen adjacent to a phenyl carbon, or a dihydropyryl ring with the oxygen adjacent to a phenyl carbon; R⁴, R⁵, and R⁶ are independently H, Me, Et, F, Cl, Br, formyl, CF₃, CHF₂, CHCl₂, CH₂F, CH₂Cl, CH₂OH, CN, C≡CH, 1-propynyl, 2-propynyl, vinyl, OMe, OEt, SMe, or SEt.
 50. The method according to claim 48, further comprising introducing into the host cell a second ligand, wherein the second ligand is 9-cis-retinoic acid or a synthetic analog of a retinoic acid.
 51. A method of modulating the expression of a gene in a host cell comprising the gene to be modulated comprising the steps of: a) introducing into the host cell the gene expression modulation system of claim 14; and b) introducing into the host cell a ligand; wherein the gene to be modulated is a component of a gene expression cassette comprising: i) a response element recognized by the DNA binding domain from the first hybrid polypeptide; ii) a promoter that is activated by the transactivation domain of the second hybrid polypeptide; and iii) a gene whose expression is to be modulated; whereby upon introduction of the ligand into the host cell, expression of the gene of b)iii) is modulated.
 52. The method according to claim 51, wherein the ligand is a compound of the formula:

wherein: E is a (C₄-C₆)alkyl containing a tertiary carbon or a cyano(C₃-C₅)alkyl containing a tertiary carbon; R¹ is H, Me, Et, i-Pr, F, formyl, CF₃, CHF₂, CHCl₂, CH₂F, CH₂Cl, CH₂OH, CH₂OMe, CH₂CN, CN, C≡CH, 1-propynyl, 2-propynyl, vinyl, OH, OMe, OEt, cyclopropyl, CF₂CF₃, CH═CHCN, allyl, azido, SCN, or SCHF₂; R² is H, Me, Et, n-Pr, i-Pr, formyl, CF₃, CHF₂, CHCl₂, CH₂F, CH₂Cl, CH₂OH, CH₂OMe, CH₂CN, CN, C≡CH, 1-propynyl, 2-propynyl, vinyl, Ac, F, Cl, OH, OMe, OEt, O-n-Pr, OAc, NMe₂, NEt₂, SMe, SEt, SOCF₃, OCF₂CF₂H, COEt, cyclopropyl, CF₂CF₃, CH═CHCN, allyl, azido, OCF₃, OCHF₂, O-i-Pr, SCN, SCHF₂, SOMe, NH-CN, or joined with R³ and the phenyl carbons to which R² and R³ are attached to form an ethylenedioxy, a dihydrofuryl ring with the oxygen adjacent to a phenyl carbon, or a dihydropyryl ring with the oxygen adjacent to a phenyl carbon; R³ is H, Et, or joined with R² and the phenyl carbons to which R² and R³ are attached to form an ethylenedioxy, a dihydrofuryl ring with the oxygen adjacent to a phenyl carbon, or a dihydropyryl ring with the oxygen adjacent to a phenyl carbon; R⁴, R⁵, and R⁶ are independently H, Me, Et, F, Cl, Br, formyl, CF₃, CHF₂, CHCl₂, CH₂F, CH₂Cl, CH₂OH, CN, C≡CH, 1-propynyl, 2-propynyl, vinyl, OMe, OEt, SMe, or SEt.
 53. The method according to claim 51, further comprising introducing into the host cell a second ligand, wherein the second ligand is 9-cis-retinoic acid or a synthetic analog of a retinoic acid.
 54. An isolated host cell comprising the gene expression modulation system according to claim 1.
 55. The isolated host cell according to claim 54, wherein the host cell is selected from the group consisting of a bacterial cell, a fungal cell, a yeast cell, an animal cell, and a mammalian cell.
 56. The isolated host cell according to claim 55, wherein the mammalian cell is a murine cell or a human cell.
 57. An isolated host cell comprising the gene expression modulation system according to claim 14.
 58. The isolated host cell according to claim 57, wherein the host cell is selected from the group consisting of a bacterial cell, a fungal cell, a yeast cell, an animal cell, and a mammalian cell.
 59. The isolated host cell according to claim 58, wherein the mammalian cell is a murine cell or a human cell.
 60. A non-human organism comprising the host cell of claim 54.
 61. The non-human organism according to claim 60, wherein the non-human organism is selected from the group consisting of a bacterium, a fungus, a yeast, an animal, and a mammal.
 62. The non-human organism according to claim 61, wherein the mammal is selected from the group consisting of a mouse, a rat, a rabbit, a cat, a dog, a bovine, a goat, a pig, a horse, a sheep, a monkey, and a chimpanzee.
 63. A non-human organism comprising the host cell of claim 57.
 64. The non-human organism according to claim 63, wherein the non-human organism is selected from the group consisting of a bacterium, a fungus, a yeast, an animal, and a mammal.
 65. The non-human organism according to claim 64, wherein the mammal is selected from the group consisting of a mouse, a rat, a rabbit, a cat, a dog, a bovine, a goat, a pig, a horse, a sheep, a monkey, and a chimpanzee.
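
The chimeric ligand binding domain claims recite each junction twice: once as nucleotide ranges (for example claims 8 and 45) and once as amino acid ranges (for example claims 9 and 47). The following Python sketch is illustrative only and is not part of the claims. It checks that the recited nucleotide ranges map onto the recited amino acid ranges, assuming that coordinates are 1-based and inclusive and that the reading frame begins at nucleotide 1. The helper names `nt_to_aa_range` and `join_fragments` are hypothetical, and joining the two fragments in the recited order is an assumption.

```python
# Illustrative only: the coordinate pairs are copied from options b) through h)
# of the chimeric RXR ligand binding domain claims. Helper names are hypothetical.

# (range in SEQ ID NO: 13, range in SEQ ID NO: 21), nucleotide coordinates
NUCLEOTIDE_RANGES = [
    ((1, 348), (268, 630)),
    ((1, 408), (337, 630)),
    ((1, 465), (403, 630)),
    ((1, 555), (490, 630)),
    ((1, 624), (547, 630)),
    ((1, 645), (601, 630)),
    ((1, 717), (613, 630)),
]

# The corresponding amino acid coordinates as recited in the claims
AMINO_ACID_RANGES = [
    ((1, 116), (90, 210)),
    ((1, 136), (113, 210)),
    ((1, 155), (135, 210)),
    ((1, 185), (164, 210)),
    ((1, 208), (183, 210)),
    ((1, 215), (201, 210)),
    ((1, 239), (205, 210)),
]


def nt_to_aa_range(start, end):
    """Map a 1-based inclusive nucleotide range to codon numbers.

    Assumes the reading frame starts at nucleotide 1, so the range must
    begin on the first base of a codon and end on the third base.
    """
    if (start - 1) % 3 != 0 or end % 3 != 0:
        raise ValueError("range does not fall on codon boundaries")
    return ((start - 1) // 3 + 1, end // 3)


def join_fragments(seq_a, range_a, seq_b, range_b):
    """Concatenate two fragments given as 1-based inclusive ranges.

    Joining in this order (first sequence, then second) is an assumption.
    """
    (a_start, a_end), (b_start, b_end) = range_a, range_b
    return seq_a[a_start - 1:a_end] + seq_b[b_start - 1:b_end]


# Derive amino acid ranges from the nucleotide ranges and compare them
# with the ranges recited in the amino acid claims.
derived = [
    (nt_to_aa_range(*first), nt_to_aa_range(*second))
    for first, second in NUCLEOTIDE_RANGES
]
ranges_agree = derived == AMINO_ACID_RANGES
```

Under these assumptions every nucleotide range falls on codon boundaries, which is why, for instance, nucleotides 1-348 correspond to amino acids 1-116 (348 / 3 = 116) and nucleotides 268-630 correspond to amino acids 90-210.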